




Journal of Advertising Research





Vol. 1, No. 4 





JUNE 1961





The Influence of Yeasaying Response Style 
WILLIAM D. WELLS 


Ad Recognition and Response Set 
VALENTINE APPEL AND MILTON L. BLUM 


Some Correlates of Coffee and Cleanser Brand Shares 
SEYMOUR BANKS 


How Reliable Is Aided Recall of TV Viewing? 
A. S. C. EHRENBERG


Letters JOHN C. MALONEY 
FRANK MEISSNER 





FEDERAL STATISTICS IN ADVERTISING 
ON METHODS 

RESEARCH IN REVIEW 

PUBLICATIONS RECEIVED 


EDITORIAL 







Published by the Advertising Research Foundation 








Journal of Advertising Research 


is published by the Advertising Research Foundation and sent 
free to its members. Single copies are available to eligible non- 
members of ARF at $10, and to ineligible non-members at 
$2.50. 


solicits original papers. Reports of findings are favored over 
theoretical discussion. Manuscripts should be submitted in 
triplicate, double-spaced, with references, tables and figures 
on separate pages. Authors receive 50 free offprints. Letters 
of comment and criticism are also invited. 


is intended for practitioners and users of advertising research. 
Limits on readers’ time and journal space require that papers 
be as short as clarity permits. 


is an open forum. Publication in it implies no endorsement of 
the writer’s purpose, methods or views by the Advertising Re- 
search Foundation, its board of directors, or any of its com- 
mittees. 


is edited by the ARF Technical Staff: 


Charles K. Ramond . . . . . Technical Director
Ingrid C. Kildegaard . . . Research Statistician
Gwyn Collins . . . . . . . Research Mathematician
Thornton C. Lockwood . . . Research Associate
George M. Shirey . . . . . Research Associate
Naomi Boretz . . . . . . . Librarian
Joey Harris


© 1961 by the Advertising Research Foundation, Inc. 
3 East 54th Street, New York 22, N. Y. 





































The Influence of
Yeasaying Response Style¹


WILLIAM D. WELLS


Rutgers University and Benton & Bowles, Inc.


Some people tend to say yes more easily than others, whatever question they
are asked. Dr. Wells describes a test which identifies these yeasayers, and
reports how they differ from naysayers in various other kinds of behavior.


IT IS SOMETIMES CONVENIENT to think of answers
to questions as being determined by information,
role and style. "Information" refers to all the
knowledge the answerer can bring to bear when
making his reply. It determines answers because
different respondents have different information
at their disposal and therefore answer differently.
"Role" refers to the way the respondent is trying
to portray himself to the questioner. It is important
because some roles encourage accurate responses
and some encourage falsification. "Style" refers to
the way the respondent reacts to questions as
stimuli. It influences answers because some people
express themselves in one way, some in another.

This report is concerned with style, and with
the effects of a particular kind of response style on
research results. It will show that style can be an
important factor in studies which count on attitude
scales, rating scales, personality inventories, survey
questionnaires, "open-end" interviews, or projective
tests to supply basic data.

The notion that response style influences research
results is not new. In 1937 Lorge described a general
stylistic tendency to respond affirmatively and
agreeably, and discussed its implications for
research and theory. Singer and Young (1941),
Cronbach (1946, 1950), Hathaway (1948), Berg and
Rapaport (1954), Gaier and Bass (1959) and Hare
(1960) have also published on this topic.

¹ The research described in this report was conducted by
the research department of Benton & Bowles, and at Rutgers
University with the aid of a Benton & Bowles research grant.

Early work on response style emphasized the
difficulties it can cause in analysis and interpretation
of research data. (See for example Lentz,
1938; Rubin, 1940; Rundquist, 1950; Philip, 1947;
and especially the reviews by Cronbach, 1946 and
1950.) While such difficulties are still of major
interest, some recent research has indicated that
response styles may be more important than originally
supposed: that, in addition to interfering
with accurate analysis and interpretation, they may
be surface outcroppings of deep veins in character
and personality.

An example of this double emphasis is the work 
of Couch and Keniston (1960), who coined the 
terms “Yeasayer” and “Naysayer” to designate con- 
trasting response styles which both influence re- 
search results and serve as clues to the personality 
of the responder. Their studies, and the studies 
summarized by Jackson and Messick (1958) and 
Cronbach (1946, 1950) supplied the theoretical 
and empirical background of the research reported 
here. 


Descriptive Definition of Terms 

"Yeasayers," as described by Couch and Keniston
and as exemplified in the research summarized
below, are perhaps best described as impulsively
over-expressive. On personality inventories, attitude
scales, and survey questionnaires, they tend to say
"yes," to agree, to be enthusiastic and uncritical, to
give high ratings to objects which impress them
favorably. "Naysayers" by contrast are cautiously
under-expressive. They are more apt to be controlled
in their responses, careful, conservative and
critical. They are more likely to say "no," to be
moderate rather than enthusiastic, to avoid committing
themselves unless they are sure of what
they are doing.


Operational Definition 

In the present research, response style was meas- 
ured by paper-and-pencil questionnaires designed 
to give short reliable measures of yeasaying and 
naysaying tendencies. One of these questionnaires 
and some data on its development are given in 
Table 1. 

It should be noted that the terms "Yeasayer"
and "Naysayer" are not intended to imply a
strict dichotomy. Like the terms "short" and
"tall," "Yeasayer" and "Naysayer" refer to alternate
sides of a continuous distribution. There are degrees
of "yeasayingness" and "naysayingness," just
as there are degrees of "shortness" and "tallness,"
and where one merges into the other is largely a
matter of arbitrary definition. The answer to the
question, "How many Yeasayers and how many
Naysayers are there in the general population?"
therefore depends entirely on where one sets the
point of separation.

In much of the research reported here the respondents
were divided into two groups at the
middle of the possible range of scores on YN-2.
With this division 40 to 50 per cent of those
interviewed fell above the cutting-point, and 50 to 60
per cent fell below it.

Instead of adopting an arbitrary cutting-point,
it would have been possible to have split each
sample 50-50 by dividing at the median, but then
some respondents classified as Yeasayers in one
set of data would have been classified as Naysayers
in another. This was considered undesirable. An
additional consideration in favor of the arbitrary
cutting-point was that it separates respondents who
gave predominantly positive responses from respondents
who gave predominantly negative responses.
In other words, above-midpoint scores on
YN-2 represent "yeasaying" in fact as well as by
definition.

It would also have been possible to have applied
the terms "Yeasayer" and "Naysayer" only to individuals
at the extremes. This procedure would have
sharpened Yeasayer-Naysayer differences in many
instances, but it would also have reduced the number
of persons classified in one group or the other.
When samples were small it seemed wiser to accept
the smaller difference between groups in order to
gain the increased stability of results computed
on broader bases.
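
The scoring rule given with Table 1 (sum the twenty responses, count omissions as 4) and the fixed separation point of 80 can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the original studies: the function names `score_yn2` and `classify`, and the use of `None` to mark an omitted item, are assumptions introduced here.

```python
from typing import Optional, Sequence

N_ITEMS = 20          # YN-2 has twenty items
OMITTED_VALUE = 4     # Table 1: omissions are counted as 4's
CUTTING_POINT = 80    # midpoint of the possible range, 20 to 140


def score_yn2(responses: Sequence[Optional[int]]) -> int:
    """Sum the 1-7 responses; None (an assumed marker) means the item was omitted."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    total = 0
    for response in responses:
        if response is None:
            total += OMITTED_VALUE
        elif 1 <= response <= 7:
            total += response
        else:
            raise ValueError(f"response out of range: {response!r}")
    return total


def classify(score: int, cutting_point: int = CUTTING_POINT) -> str:
    """Fixed cutting-point: 80 and above is a Yeasayer, 79 and below a Naysayer."""
    return "Yeasayer" if score >= cutting_point else "Naysayer"


if __name__ == "__main__":
    # A hypothetical respondent: mostly mild agreement, two items omitted.
    answers = [5, 5, 4, 6, 3, 5, None, 4, 5, 6, 2, 5, 4, None, 5, 3, 6, 5, 4, 5]
    total = score_yn2(answers)
    print(total, classify(total))
```

Because the cutting-point is a constant rather than a sample median, the same score is classified the same way in every data set, which is the property the fixed division was chosen to preserve.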

It is important to re-emphasize the premise stated
earlier that behaviors influenced by yeasaying
(ratings, questionnaire responses, answers in open-end
interviews, etc.) are always influenced by information
and role, as well as by style. Because
information and role sometimes combine to counteract
the effects of style, Yeasayers do not always
say "yes," and Naysayers do not always say "no";
Yeasayers do not always give high ratings, and
Naysayers do not always give low ratings. Yeasaying
and naysaying are tendencies, not ironclad
regulations.

In spite of the fact that yeasaying and naysaying
are not absolutes, the work reported here suggests
that response style is a potential problem whenever
answers to questions supply research data. When
conditions are right response style can be decisive.


RATING SCALES 
Tendency to Take Extreme Positions 

Yeasayers differ from Naysayers in the way they
use rating scales. Yeasayers are more inclined to
take extreme positions, to avoid caution and
qualification, to avoid neutrality. Their high ratings
therefore tend to be higher, and their low ratings
tend to be lower. This tendency is exhibited in
TABLE 1
THE YN-2 SCALE


You can rate the following statements on a seven point scale as follows:


   (1)        (2)        (3)           (4)           (5)       (6)       (7)
 Strongly   Disagree   Slightly   Neither Agree   Slightly    Agree   Strongly
 Disagree              Disagree    nor Disagree     Agree               Agree


___ 1. Novelty has a great appeal to me.
___ 2. Let us eat, drink and be merry for tomorrow we die.
___ 3. There are few things more satisfying than to really splurge on something.
___ 4. I often make decisions on the spur of the moment.
___ 5. I really enjoy plenty of excitement.
___ 6. I'm apt to really blow up, but it doesn't last long.
___ 7. It's great fun just to mess around.
___ 8. I often say the first thing that comes to my mind.
___ 9. Here today, gone tomorrow . . . that's my motto.
___10. When I talk, I tend to bounce from topic to topic.
___11. I often change my feelings about others.
___12. There is nothing so satisfying as to really tell someone off.
___13. I like to see people express their emotions.
___14. I crave excitement.
___15. My mood is easily influenced by the people around me.
___16. It's a wonderful feeling to sit surrounded by your possessions.
___17. I tend to act on impulse.
___18. I would like to have breakfast in bed every morning.
___19. I often lose my temper.
___20. Movement, travel, change, excitement . . . that's the life for me.


This scale is the second of two scales used to measure yeasaying tendencies in
the general population.

The first scale, YN-1, consisted of twenty items from Couch and Keniston's
Tables 8 and 9 (Couch and Keniston, 1960). Reports by interviewers indicated
that some respondents had difficulty understanding the reverse-scored items, that
some of the vocabulary was not understood by respondents with little education,
and that some of the item content was too personal. These reports, and an
internal-consistency item analysis of YN-1, led to YN-2.

The best YN-1 items were retained practically unchanged. Some of the
reverse-scored items were reworded so that they could be scored in the positive
direction. The vocabulary was simplified, and a few new items were added. In
making these changes, every effort was made to stay as close as possible to the
original items, so that information obtained with YN-2 could form a feed-back
loop with basic theory.

YN-2 is scored by summing the numerical responses to the twenty items.
Omissions are counted as 4's. The lowest possible score therefore is 20, the
highest possible score is 140, and 80 is the midpoint of the possible range.
Average scores have varied from group to group; most have been between 75 and
79. Most distributions have been approximately normal.

YN-1 had a (corrected) split-half reliability of .63. Similar coefficients for YN-2
have ranged from .82 to .92.

YN-2 is presented here as an interim measure, not as a finished product. It was
an improvement over YN-1, and there are many reasons to believe it can be
improved still further.












































FIGURE 1
RATINGS OF TWO ADVERTISEMENTS

Respondents were adults interviewed in Newark and vicinity.
Scale was YN-2. Yeasayers defined as 80 and above;
Naysayers as 79 and below. Significance of difference between
Yeasayers' ratings and Naysayers' ratings tested by casting
data into two-by-two tables (7 vs. 6 and below for Ad A, 1 vs.
2 and above for Ad B) and applying Chi Square. Differences
were significant at .05 level.

Figure 1, which shows how Yeasayers and Naysayers
evaluated two especially selected advertisements
on a simple "unattractive-attractive" rating scale.
Ad A, selected to elicit favorable reaction, was a
four-color food ad with obvious appetite appeal.
Ad B was a black and white patent medicine
advertisement with a grim and graphic illustration
of infected sinus cavities. As Figure 1 shows, both
Yeasayers and Naysayers liked ad A and disliked
ad B, but Yeasayers' ratings were more extreme in
both directions. The rating differences were not
large in this instance, but in routine testing differences
of this magnitude frequently carry the decision.

Figure 1 shows that Yeasayers can react negatively
if the stimulus is negative enough. Negative
ratings from Yeasayers are rare, however, partly
because Yeasayers tend to avoid giving low ratings
unless the stimulus is odious, and partly because
the stimuli used in rating studies are more often
good than bad. It takes careful selection to get a
stimulus as negative as this one.

Table 2, a summary of some ratings on a +5 to
-5 like-dislike rating scale, shows a more usual
pattern: in these data, Yeasayers' ratings are generally
higher than Naysayers' ratings. This reaction
has been observed repeatedly: in ratings of popular
brands, in like-dislike ratings of foods and beverages,
in ratings of the desirability of various personality
traits, in ratings of well-known political
figures, in ratings of books, magazines and television
programs. The pattern appears to be quite
general: ratings made by Yeasayers are likely to be
higher than ratings made by Naysayers if the rated
object is not obnoxious.


TABLE 2

LIKE-DISLIKE RATINGS OF SIXTEEN WELL-KNOWN
BRANDS BY NAYSAYERS (N) AND YEASAYERS (Y)

Gasolines              N      Y    Difference
    A                 2.0    2.6      0.6*
    B                 1.9    2.6      0.7*
    C                 1.7    1.3     -0.4*
    D                 0.4    0.5      0.1

Coffees                N      Y    Difference
    A                 3.3    3.0     -0.3*
    B                 1.3    1.7      0.4*
    C                 0.9    1.2      0.3
    D                 0.2    0.4      0.2

Cameras                N      Y    Difference
    A                 1.8    2.5      0.7
    B                 1.5    2.4      0.9*
    C                 0.8    1.3      0.5*
    D                 0.6    0.8      0.2

Headache Remedies      N      Y    Difference
    A                 2.9    3.5      0.6*
    B                 2.5    3.0      0.5*
    C                 2.0    2.5      0.5*
    D                 0.6    1.4      0.8*

* Difference statistically significant at the .05 level. Figures are mean ratings
by 166 Naysayers and 112 Yeasayers. Of the Yeasayers 52 were males, and 60 were
females. Of the Naysayers, 86 were males and 80 were females. All raters were
adults living in Des Moines, Iowa, and Omaha, Nebraska. Steps in the rating scale
were designated by numbers only, with +5 defined as "like very much" and -5
defined as "dislike very much." The scale used to measure yeasaying was YN-2.
The separation point between Yeasayers and Naysayers was a score of 80. If only
extreme groups had been used (for example only Yeasayers scoring above 90 and
only Naysayers scoring below 70), the entries in the Difference column would
have been approximately doubled.


Implications. Yeasayers' fondness for the high
favorable rating can be worrisome when it is necessary
to compare ratings made by a group containing
large numbers of Yeasayers with ratings made by
a group containing large numbers of Naysayers.
The consequences of such mismatching are complex,
because they depend at least in part on the
real relationship between the objects being rated.
If the objects rated are in fact of equal value
and if one rating group contains a large number
of Yeasayers while the other rating group contains
a large number of Naysayers, the Yeasayers' tendency
to give high ratings will make the object
they rate appear to be superior, a false conclusion.
If the objects rated are in fact different in value,
the outcome will depend upon which group happens
to rate which object. If the Yeasayers rate the
better of two objects, the rated difference will be in
the right direction but somewhat inflated, because
Yeasayers rate things up and Naysayers rate things
down. If the Yeasayers rate the poorer of the two
objects, the real difference between the objects will
be obscured. When the real difference between objects
is small, and when the response style difference
between rating groups is large, real differences can
be reversed.

The ratings in Table 2 provide an illustration
of such reversals. In these ratings, Yeasayers and
Naysayers agreed on the rank order of the rated
brands, but in three of the four product categories
the Naysayers' rating of their favorite brand was
lower than the rating assigned by the Yeasayers
to the brand they liked second best. The consequences
of this difference can be understood by
imagining that conditions had made it necessary
to assign brand A to one group for rating, and
brand B to the other. If brand A had been assigned
to the Yeasayers and brand B to the Naysayers, the
apparent difference between the brands would have
been in the right direction but inflated. If brand
B had been assigned to the Yeasayers and brand A
to the Naysayers, brand B would have (incorrectly)
received the higher rating, just because it was the
Yeasayers who rated it. Reversals of this kind have
been observed often enough to make them appear
to be fairly common.
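
The thought experiment above can be checked directly against the brand A and brand B means in Table 2. The following Python sketch is offered only as an illustration: the dictionary layout and the function names `apparent_difference` and `reversed_categories` are assumptions introduced here, while the numbers are the Table 2 means for brands A and B.

```python
# Mean like-dislike ratings for brands A and B, transcribed from Table 2.
# "N" = Naysayers, "Y" = Yeasayers.
TABLE_2 = {
    "Gasolines":         {"N": {"A": 2.0, "B": 1.9}, "Y": {"A": 2.6, "B": 2.6}},
    "Coffees":           {"N": {"A": 3.3, "B": 1.3}, "Y": {"A": 3.0, "B": 1.7}},
    "Cameras":           {"N": {"A": 1.8, "B": 1.5}, "Y": {"A": 2.5, "B": 2.4}},
    "Headache Remedies": {"N": {"A": 2.9, "B": 2.5}, "Y": {"A": 3.5, "B": 3.0}},
}


def apparent_difference(category: str, group_for_a: str, group_for_b: str) -> float:
    """Brand A's mean minus brand B's mean when each brand is rated by the named group."""
    ratings = TABLE_2[category]
    return ratings[group_for_a]["A"] - ratings[group_for_b]["B"]


def reversed_categories() -> list:
    """Categories where brand B wins if Yeasayers rate B and Naysayers rate A."""
    return [c for c in TABLE_2 if apparent_difference(c, "N", "Y") < 0]


if __name__ == "__main__":
    for category in TABLE_2:
        same_raters = apparent_difference(category, "N", "N")
        yea_rate_a = apparent_difference(category, "Y", "N")
        yea_rate_b = apparent_difference(category, "N", "Y")
        print(f"{category:18s} same raters {same_raters:+.1f}   "
              f"Yeasayers rate A {yea_rate_a:+.1f}   Yeasayers rate B {yea_rate_b:+.1f}")
    print("Reversed:", reversed_categories())
```

A negative value in the last column marks a reversal: the brand both groups preferred comes out behind solely because of who rated it.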

Preventive Action. The obvious precaution
against reversals of this kind is to arrange things
so that objects to be compared are always rated by
the same individuals. The evidence in Table 2
indicates that such a precaution would be well
worthwhile, even if it required compromise with
some other aspect of study design.

Sometimes it is impossible to arrange things so
that objects to be compared are rated by the same
raters, no matter how desirable such an arrangement
might be. For example, it is sometimes necessary
to make comparisons among objects widely
separated in space or time, as when supervisors at
different factories evaluate their foremen, or when
ratings of one test product are compared with ratings
of test products evaluated earlier and presently
unavailable. Some rating tasks require unrepeatable
experiences, as when students evaluate their teachers
at the end of a semester, or when patients or
therapists evaluate the effects of therapy. In situations
such as this, when different groups must rate
different objects, measurement of the raters' yeasaying
tendencies provides a check on the degree to
which such tendencies might be suspected of
influencing the findings.

Corrective Action. Knowledge that rating groups 
differ markedly in yeasaying provides the oppor- 
tunity for corrective action. It is sometimes possible 
to bring groups into line by rejecting the responses 
of certain raters. When this procedure cannot be 
used, either because it would reduce the size of 
the rating groups too much or because yeasaying 
tendencies are correlated with attitude toward the 
object being rated, it is sometimes possible to re- 
move response style bias by partial correlation. 
When none of these procedures can be used, it may 
be necessary to accept response style bias as an 
unfortunate but unavoidable characteristic of the 
data. Even under these circumstances, it is better 
to know about the bias than to be misled by it. 
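
The partial correlation mentioned above can be sketched with the standard first-order formula, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz²)(1 - r_yz²)). The Python below is a minimal illustration, not the analysis used in the studies reported here. All data in it are hypothetical and were constructed so that two ratings share nothing except a response style component; the names `pearson` and `partial_corr` are assumptions introduced for the example.

```python
import math


def pearson(x, y):
    """Plain Pearson product-moment correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y with z held constant."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data for eight raters (not from the article).
yn_score = [70, 70, 70, 70, 90, 90, 90, 90]   # YN-2 scores
taste_a = [-1, -1, 1, 1, -1, -1, 1, 1]        # real liking for object A
taste_b = [-1, 1, -1, 1, -1, 1, -1, 1]        # real liking for object B
style = [(s - 80) / 10 for s in yn_score]     # style pushes every rating up or down
rating_a = [2 + s + t for s, t in zip(style, taste_a)]
rating_b = [2 + s + t for s, t in zip(style, taste_b)]

if __name__ == "__main__":
    print("raw correlation     :", round(pearson(rating_a, rating_b), 3))
    print("partial, YN-2 fixed :", round(partial_corr(rating_a, rating_b, yn_score), 3))
```

In this constructed case the two ratings correlate only because both carry the style component, and the relationship vanishes once the yeasaying score is partialled out. Real data will rarely be this clean, and the formula assumes linear relationships among the three measures.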

Correlation Problems. Another difficulty remains. 
Segmentation of respondent groups by variables
like age, sex, income or education frequently 
produces response style differences among the sub- 
groups so defined. Segmentation by age, for ex- 
ample, is likely to introduce a response style differ- 
ence among age groups because Yeasayers tend to 
be younger and Naysayers tend to be older. This 
situation can be particularly difficult to handle 
because standard remedies for mismatching cannot 
always be applied. Matching by rejecting respond- 
ents is practically certain to destroy the repre- 
sentativeness of the subdivisions. Statistical control 
devices frequently require assumptions obviously 
violated by the data. Occasionally the only thing 
left is to recognize the existence of response style 
bias and assess its influence. 

The problem of mismatched groups is compounded
when a measurement which is itself susceptible
to influence by yeasaying is used as the
basis for sample segmentation. If, for example,
respondents who give high, medium and low ratings
to a particular magazine are compared with respect
to their attitude toward a variety of products,
response style bias has a double opportunity to affect
the findings: once in the self-rating of attitude
toward the magazine, and once in the statement of
attitude toward the products. When measures subject
to response style bias are correlated, positive
relationships are likely to appear where none exist,
genuine positive relationships are likely to be
inflated, and genuine negative relationships are likely
to be obscured.


FIGURE 2

Respondents were adults living in Newark and vicinity. Scale used to define
Yeasayers and Naysayers was YN-2. Separation point was 80. Each square
represents a mean based on 100 ratings. Scales used to rate ads were
"attractive-unattractive," "interesting-uninteresting," and
"convincing-unconvincing." Scale steps were labeled "extremely," "very" and
"slightly."


Discrimination

The data in Figure 2 came from a study in which
100 men and 100 women rated eight advertisements
on three six-point scales. Each square in this Figure
represents the mean rating of a single ad on one
of the scales by the men or by the women. There
were significant differences among ads, among
scales, and between sexes, but these differences are
not now at issue. The point illustrated is that
Yeasayers' ratings were higher than Naysayers' ratings
when both groups gave the stimulus a positive rating;
that the Yeasayer-Naysayer difference persisted
when one group rated the stimulus positively and
one group rated the stimulus negatively; and that
the difference disappeared (in fact was slightly
reversed) when the ratings by both groups were
negative.

Figure 2 gives further illustration of response
style difference, and it points up two additional
problems. The first is that Yeasayers are likely to
contribute more than their proportionate share of
the variance when ratings of positive or neutral
stimuli are compared with ratings of negative stimuli.
Yeasayers' ratings are more extreme, so they
count more heavily in determining combined results.
If Yeasayers were just like Naysayers except
for increased enthusiasm, this disproportionate
influence would not be undesirable. However, studies
by Barnes (1956), Couch and Keniston (1960), and
Webster (1960) among others indicate that Yeasayers
and Naysayers differ sharply in their outlook on
the world in general. Qualitative (as distinguished
from merely quantitative) differences in reaction
therefore seem likely in many situations. Add to
this difficulty the fact that Yeasayers' ratings tend
to be less reliable than Naysayers' ratings, and it
can be seen that having Yeasayers carry more than
their share of rating variance is not a good idea at
all.

A second research problem arises when the
investigator pays special attention to extreme
reactions. If theory indicates that attitudes in a
designated group should be extreme, and if the group
in question happens to contain a large supply of
Yeasayers, response style will work in the direction
of confirmation. Note that response style assists
confirmation only when the hypothesis is in the
appropriate direction. If, for one reason or another,
extreme ratings are expected from a group containing
a large proportion of Naysayers, response style
will work against the expectation. Note also that
these influences will be most evident in studies
which compare ratings of disliked objects (such as
products, persons or races) with ratings of objects
toward which attitudes are either favorable or
neutral. If the objects rated are all good, all bad, or
all indifferent, response style is likely to influence
over-all rating level, but not discrimination.


Susceptible Objects 

Some kinds of ratings may be practically immune
to response style biases, while others may be highly
susceptible. When the rating task is definite, specific
and clear-cut, and when the rater knows exactly
what is expected of him, it seems reasonable
to assume that rating behavior will be determined
more by information and role than by style. But
when the rater does not feel strongly one way or
the other; when he is unfamiliar with the object
being rated; when the best information he can
bring to bear is fragmentary, vague, or half-forgotten;
and when he cannot read the expected response
from the context of the rating situation;
then response style biases would seem to have an
ideal opportunity.

These generalizations have not been tested directly,
but they agree with data from the present
series of studies and with observations made by
others who have investigated response style biases
(Cronbach, 1950). They imply that Yeasayer-Naysayer
differences will arise in direct proportion to
the vagueness of the rating task required; and that,
if a rating task is clear enough, response style biases
can be minimized if not eliminated. This point of
view is of course completely consistent with the
theory underlying projective tests: if the stimulus
situation is vague and undefined, its influence on
responses will be small, and the influence of
responder characteristics will be large.


Susceptible Scale Formats 

It also seems probable that some scale formats 
invite response style biases while others reduce 
them. In line with the notion that response style 
bias and vagueness of rating task go together, it 
seems possible that generalized letter or number 
scales—which provide a minimum of guidance— 
would be more susceptible to yeasaying influence 
than would more structured verbal scales which 
provide a definite and specific frame of reference. 


PAIR COMPARISONS 

The influence of Yeasaying response style can be 
abolished by presenting stimuli in pairs and forcing 
a choice between them. In weighing the pros and 
cons of pair comparisons, freedom from response 
bias is one consideration. 

If "no difference" judgments are permitted in
pair comparisons, response style can influence
outcome because Yeasayers are generally readier to
commit themselves. The observation that there are
consistent individual differences in tendency to be
noncommittal agrees with results obtained in a
variety of rating situations (Johnston, 1948; Lorge,
1937; Mersman, 1948).

Table 3 shows response style differences in pair
comparison data from two contrasting rating situations.
The data on the left in Table 3 came from
a blind-product pair comparison rating of two almost
identical instant coffees. In this comparison,
where the difference between the stimuli was difficult
to perceive, the Naysayers tended to avoid
saying "much better" and tended more to say "no
preference."

The data on the right in Table 3 came from pair 
comparisons of five pairs of television programs. In 


TABLE 3

RESPONSE DIFFERENCES IN PAIR COMPARISON DATA

                   "Blind" Taste Test Comparison    Pair Comparison of Five
                   of Two Good Instant Coffees      Pairs of Television Programs

                   By              By               By              By
                   28 Naysayers    28 Yeasayers     48 Naysayers    45 Yeasayers
Base:              28 Responses    28 Responses     238 Responses   217 Responses

Type of Response   %               %                %               %
Much Better        18              50               50              45
Slightly Better    50              25               22              36
No Preference      32              25               28              19

Respondents in the coffee test were housewives living in the New York
metropolitan area. Respondents in the TV program comparison were adults living
in the vicinity of Newark, New Jersey. About half the TV program raters were
males; about half were females. A few TV raters omitted one or two responses,
so total responses do not equal five times number of raters. The scale used to
measure Yeasaying was YN-2. Separation point was scale midpoint.
FIGURE 3

PAIR COMPARISON EXPERIMENT WITH LIFTED WEIGHTS

A. "No Difference" Judgments

        By 20 Yeasayers
- - - - By 34 Naysayers

Stimuli were small bottles, containing varying number of pennies. Each step
along horizontal dimension represents a difference of one additional penny.
Each respondent judged each size difference five times in random order.
Respondents were adults living in or near Livingston, New Jersey. Scale used to
identify Yeasayers and Naysayers was YN-2. Separation point was 80.


this situation, where the differences between the 
stimuli were obvious, Naysayers’ tendency to say 
“no preference” was still in evidence, but the 
difference in “much better” judgments disappeared. 
These data suggest that response style is more likely 
to influence pair comparison results when the judg- 
ment is not easy. 

Figure 3 shows the results of a pair comparison 
experiment using lifted weights. The steps along 
the horizontal dimension of each diagram are in- 
creasing equal intervals of difference between 
weights, beginning with zero difference at the extreme
left. The steps on the vertical dimension are
per cent of “no difference,” “slightly heavier,” and
“much heavier” responses. Diagram 3A shows that 
the Naysayers made consistently higher proportions 
of “no difference” judgments, both when the real 
difference was zero and when it was not. Diagram 
3B shows that the Yeasayers had a stronger tend- 
ency to respond “slightly heavier” when the real 
difference between weights was small, and that this 
tendency reversed when the real difference between 
the weights became too large to be classified as 
“slight.” Diagram 3C shows that the Yeasayers had
a stronger tendency to respond “much heavier”
through almost all the series.

In Diagram 3D, the three sets of curves are plotted
on the same coordinates. The cross-over points
(indicated by small vertical arrows) show where “no 
difference” became “slightly heavier,” and where 
“slightly heavier” became “much heavier.” From 
these curves it is clear that the Naysayers required 
a larger difference between stimuli before they 
would shift from one response to the next above 
it. This finding is consistent with the coffee test
data in Table 3, and it demonstrates at least one 
of the reasons Yeasayers give high ratings: com- 
pared with Naysayers, they require less stimulus 
input before they will shift from a noncommittal 
response to a moderate response, and from a mod- 
erate response to an extreme response. Perhaps 
this is just another way of saying that Yeasayers are 
less cautious. 
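
The cross-over reading of Diagram 3D can be pictured as a simple threshold model. The Python sketch below is only an illustration under stated assumptions: the function name `classify` and the cut-point values are hypothetical and are not taken from the experiment. Each group is given two cut-points on the stimulus-difference axis, with the Naysayers' cut-points set higher.

```python
def classify(difference, slight_cut, much_cut):
    """Map a weight difference (in pennies) to a response category."""
    if difference >= much_cut:
        return "much heavier"
    if difference >= slight_cut:
        return "slightly heavier"
    return "no difference"

# Hypothetical cut-points (not from the source): Naysayers need a larger
# difference before moving up to the next response category.
YEASAYER_CUTS = (2, 5)
NAYSAYER_CUTS = (3, 7)

for diff in range(0, 9):
    yea = classify(diff, *YEASAYER_CUTS)
    nay = classify(diff, *NAYSAYER_CUTS)
    print(diff, "| Yeasayer:", yea, "| Naysayer:", nay)
```

Under these assumed cut-points the sketch also reproduces the reversal described for Diagram 3B: at small differences the Yeasayer answers “slightly heavier” while the Naysayer still answers “no difference,” and at larger differences the Yeasayer has already moved on to “much heavier” while the Naysayer answers “slightly heavier.”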


QUESTIONNAIRE ITEMS 

When questionnaire items call for “yes or no” 
answers, Yeasayers say “yes” more often than Naysayers
do. This tendency is illustrated in the first
two columns of Table 4, which show per cent “yes”
answers to a question about recent purchases. The 
question was “Did you buy any ( ) within 
the past two weeks?” with the blank filled by the 
name of one of the listed categories. 

Considering these data alone, it would be rea- 
sonable to assume that Yeasayers are simply better 
customers. However, when a variety of studies ap- 
pear to show that Yeasayers buy more, do more, 
read more, watch more, like everything better, are 
more interested in more occupations—even eat ap- 
ple pie more often and see more out-of-state license 
plates—credulity gets stretched beyond the breaking 
point. 



Further evidence that response bias is at work 
is furnished by the data in the right-hand columns 
of Table 4. These data came from a subsample of 
the respondents who supplied the data in the left- 
hand columns, but this time the question did not 
require a “yes-no” answer. Respondents in the sub- 
sample were recontacted by telephone approxi- 
mately three weeks after the original interview and 
asked “When did you last buy?” each of the com- 
modities in question. When their answers were 
coded into “within two weeks” vs. “two weeks or
over,” the purchasing difference suggested by the
original data disappeared.
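
The size of this change can be checked directly from the Table 4 percentages. The short Python sketch below is offered as an arithmetic aid, not as part of the original analysis; it averages the Yeasayer-minus-Naysayer gap over the six product categories with legible entries in all four columns (Instant Coffee is left out because one of its entries is illegible). The helper name `mean_gap` is an illustrative choice.

```python
# Per cent reporting a purchase within the past two weeks (Table 4).
# Instant Coffee is omitted because one of its entries is illegible.
categories = ["Package Soap or Detergent", "Toilet Soap", "Toothpaste",
              "Ground Coffee", "Cigarettes", "Headache Remedies"]

yes_no = {"naysayers": [69, 60, 52, 49, 36, 27],
          "yeasayers": [86, 79, 73, 59, 62, 35]}
open_end = {"naysayers": [79, 83, 63, 58, 46, 21],
            "yeasayers": [67, 93, 53, 53, 53, 13]}

def mean_gap(table):
    """Average Yeasayer-minus-Naysayer difference in percentage points."""
    gaps = [y - n for y, n in zip(table["yeasayers"], table["naysayers"])]
    return sum(gaps) / len(gaps)

print(round(mean_gap(yes_no), 1))    # 16.8
print(round(mean_gap(open_end), 1))  # -3.0
```

On the “yes-no” form the Yeasayers lead by roughly 17 percentage points on average; on the “When did you last buy?” form the average gap is about 3 points in the other direction.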


Sampling Implications. If a filter question of the 
“yes-no” type (e.g.: “Have you rented an automo- 
bile within the past six months?” or “Do you plan 
to go to college after graduation?”) is used to select 
respondents for further interviewing, too many 
Yeasayers and too few Naysayers will be drawn into 
the sample. To the extent that Yeasayers and Nay- 
sayers differ in reaction to the object in question, 
a sample drawn in this way will be unrepresenta- 
tive of the intended population. 
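
The sampling argument above is a matter of simple proportions. As a hedged illustration, the Python sketch below uses hypothetical population shares and hypothetical “yes” rates (none of these figures come from the studies reported here) to show how a “yes-no” filter shifts the make-up of the screened-in sample.

```python
def share_after_filter(pop_share, yes_rate):
    """Composition of the screened-in sample by response-style group."""
    passed = {g: pop_share[g] * yes_rate[g] for g in pop_share}
    total = sum(passed.values())
    return {g: passed[g] / total for g in passed}

# Hypothetical figures, chosen only for illustration.
pop_share = {"yeasayers": 0.5, "naysayers": 0.5}
yes_rate = {"yeasayers": 0.30, "naysayers": 0.20}

sample = share_after_filter(pop_share, yes_rate)
print({g: round(s, 2) for g, s in sample.items()})
# {'yeasayers': 0.6, 'naysayers': 0.4}
```

With equal numbers of both groups in the population, a filter question that Yeasayers answer “yes” 30 per cent of the time and Naysayers 20 per cent of the time yields a sample that is 60 per cent Yeasayers.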

Estimation of Behavior from Questionnaire Re- 
sponses. If certain segments of the population con- 
tain disproportionate numbers of Yeasayers or Nay- 
sayers, their behavior as measured by questionnaire 
questions will be systematically overestimated or 
underestimated. Although available information is 
not entirely clear, it appears that younger age 
groups and lower income groups contain a larger 
proportion of Yeasayers, while middle age groups 
and middle income groups contain a larger pro- 
portion of Naysayers. It seems reasonable to as- 
sume that response style differences are associated 
with other demographic variables as well. 


TABLE 4

REPORTS OF PURCHASES BY YEASAYERS AND NAYSAYERS RELATED TO FORM OF QUESTION

                             Question: “Did you buy any      Question: “When did you
                             (_____) within the past         last buy (_____)?”
                             two weeks?”
                             “Yes” Answers                   Answers Indicating Purchase
                                                             Within Past Two Weeks

                             By 45         By 29             By 24         By 15
                             Naysayers     Yeasayers         Naysayers     Yeasayers
Product Category             %             %                 %             %
Package Soap or Detergent    69            86                79            67
Toilet Soap                  60            79                83            93
Toothpaste                   52            73                63            53
Ground Coffee                49            59                58            53
Instant Coffee               [illegible]   52                54            47
Cigarettes                   36            62                46            53
Headache Remedies            27            35                21            13

Respondents were housewives living in Newark, New Jersey. Scale used was YN-1. Separation point was midpoint of scale.



TABLE 5

PER CENT OF YEASAYERS, NAYSAYERS AND MIDDLESAYERS AMONG RESPONDENTS WHO CLAIM TO
READ “EVERY ISSUE” OR “MOST ISSUES” OF FIVE MAGAZINES

                 Men                                      Women
Magazine         Naysayers  Middlesayers  Yeasayers       Naysayers  Middlesayers  Yeasayers
A                22         43            35              29         41            30
B                32         43            25              35         40            25
C                32         46            22              38         41            21
D                45         40            15              37         43            20
E                39         44            17              50         33            17
Per Cent of
Total Sample:    34         44            22              36         42            22

Percentages total 100 across for men and women. Bases for percentages ranged from 23 to 109. Individual bases withheld to prevent
identification of magazines by name. Sample: 200 men and 200 women living in Newark, New Jersey area. Scale for identifying response style
group was YN-2. “Naysayers” scored below 70; “Middlesayers” scored 70-89; “Yeasayers” scored 90 and above. Three-way division was
used because available sample was larger than usual.

fluenc 
Implications for Media Research. The kind of sample segmentation
normally used in media research is likely to create groups containing
unusual numbers of Yeasayers and Naysayers. Table 5 shows the
Yeasayer-Naysayer balance among the readers of five well-known
magazines. In line with Yeasayers’ tendency to over-rate and
over-report, and Naysayers’ tendency to be conservative, it would not
be surprising to find that readers of magazines A and B had
overestimated purchases, buying plans and other attitudes; while
readers of magazines D and E had done the opposite. Note that
systematic under-reporting or over-reporting cannot be diagnosed from
questionnaire results alone. If readers of magazines D and E really
were a lot more active than readers of magazines A and B,
questionnaire results would show the difference; however, the size of
the difference would be underestimated because readers of D and E
would be under-reporting their behavior while readers of A and B would
be exaggerating theirs. It appears likely that “Naysayers’ magazines”
are systematically penalized by research which depends upon
questionnaire responses, while “Yeasayers’ magazines” are given a
systematic advantage.

The same principle would hold true in questionnaire studies of the
audiences of television programs.

Implications for Factor Analysis of Questionnaires. Another research
implication of response style difference concerns the effect it may
have on attempts to understand the relationships among large batteries
of questions. With Yeasayers tending to say “yes” and Naysayers
tending to say “no,” cross tabulation of a miscellaneous group of
unrelated “yes-no” or “agree-disagree” questions will show a large
number of spurious but statistically significant correlations. Factor
analysis of such data will pick up yeasaying as the first centroid
factor (Couch and Keniston, 1960).

If the behaviors or opinions designated by a battery of “yes-no” or
“agree-disagree” questions actually are correlated, positive
correlations will be inflated and negative correlations will be
decreased. This may be the reason it has proved so difficult to
construct Likert-type attitude scales with positively worded items
representing the negative scale pole.

Susceptible Question Forms and Susceptible Objects. It seems likely
that certain question forms are especially susceptible to response
style influences. The data in Table 4 show the difference between a
“yes-no” and an “open-end” question. Other alterations in question
form would be expected to show other kinds of differences. Cronbach
(1950) has suggested that items of the “yes-no,” “true-false,”
“agree-disagree,” “like-indifferent-dislike” type are especially
susceptible. Data from the present studies suggest that all kinds of
check lists are susceptible as well. Open-end questions are at least
immune from a general tendency to agree or disagree; and, of course, a
strict forced-choice format abolishes response style influence
altogether.

It also seems likely that certain classes of questioning would be more
susceptible to response style influence than others. Questions which
are vague or ambiguous, questions which permit overgeneralization or
ask for rough guesses, questions to which the respondent doesn’t know
the answer: all invite response style to operate. On the other hand,
clear-cut, specific, direct questions which ask for information the
respondent has at his command and is willing to reveal are likely to
be influenced little if at all.

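
The spurious correlations described under Implications for Factor Analysis of Questionnaires can be worked out exactly for a simple case. The Python sketch below is a hedged illustration, not the author's analysis: it assumes a population split into groups that differ only in their general probability of answering “yes,” and it assumes that two items are completely unrelated within each group. The function name and the 0.7 and 0.3 “yes” rates are hypothetical.

```python
def spurious_correlation(groups):
    """Correlation between two yes-no items that are independent within
    each group but share the group's general tendency to say "yes".

    groups: list of (population share, probability of answering "yes").
    """
    mean_p = sum(share * p for share, p in groups)
    # Within a group the items are independent, so P(yes to both) = p * p.
    both_yes = sum(share * p * p for share, p in groups)
    covariance = both_yes - mean_p ** 2
    variance = mean_p * (1 - mean_p)  # same for both items
    return covariance / variance

# Hypothetical mix: half Yeasayers (70% "yes"), half Naysayers (30% "yes").
groups = [(0.5, 0.7), (0.5, 0.3)]
print(round(spurious_correlation(groups), 2))  # 0.16
```

Under these assumptions two items with no relationship in content correlate 0.16 purely because of response style, and every pair of items in a battery would show the same positive correlation.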
DISCUSSION

The purpose of this report has been to show that response style can
have an important, occasionally crucial, effect on research results.
Yeasaying has been shown at work and play in studies employing rating
scales, pair comparisons and questionnaires. Some research
implications of yeasaying’s influence have been demonstrated, and some
suggestions made for remedial or preventive action.

Important research questions still remain. How are Yeasayers and
Naysayers distributed through various segments of the population? What
kinds of content, what scale formats, what question forms are most and
least affected by response style influences? What measures prevent or
counteract response distortions? Under what circumstances are remedial
measures beneficial in terms of over-all efficiency, and under what
circumstances is the remedy worse than the disease?

Another important still unsettled question is whether the present
approach to measuring yeasaying is the best that can be devised. In
developing the Yeasayer concept, Couch and Keniston banished the
influence of item content by obtaining an Over-all Agreement Score
from a large, heterogeneous and content-balanced set of items. But
when they assembled items to form an Agreeing Response Scale, item
content in the form of impulsive expressiveness vs. conservative
self-control came back in. The development of YN-1 and YN-2 continued
this process, and refinement of YN-2 through internal consistency item
analysis would certainly continue it still further.

Perhaps a different approach would yield a simpler, more accurate,
more efficient response style index. Bass (1956) has worked out a
“social acquiescence” scale based on degree of agreement with homey
sayings. Items of this kind might prove easier to use and more
effective in survey research settings. Jackson and Messick (1958) have
investigated statistical techniques for deriving separate “style” and
“content” scores from one set of measurements. Separate scores could
prove more useful than scores on scales like YN-2, in which style and
content are confounded. Berg’s Perceptual Reaction Test, a collection
of geometric designs, can be scored for tendency to rate the designs
favorably. This approach opens the possibility of a “content-free”
response style index (Berg, Hunt, and Barnes, 1949; Lewis and Taylor,
1955). Systematic tryout of all these alternatives would seem worth
undertaking.

One other basic issue needs some discussion. The
emphasis in this report has been on yeasaying re- 
sponse style as an artifact which interferes with 
analysis and interpretation of research data. Little 
has been said about its meaning as a symptom of 
deep trends in personality. Yet when yeasaying is 
defined by scales like YN-2, every response differ- 
ence between Yeasayers and Naysayers is poten- 
tially a matter both of style and of deep-seated 
difference in reaction. Yeasayers claim to buy more 
of a certain product than Naysayers do. How much 
of this difference can be attributed to the way the 
two groups answer questions? How much of it 
reflects a real difference in purchasing? 

Work on this complex problem has not advanced 
very far, but some pieces of evidence are now avail- 
able. In intensive clinical interviews with their 
college-student subjects, Couch and Keniston found 
fundamental differences between Yeasayers and 
Naysayers in basic traits of personality. Such differ- 
ences would lead one to expect important differ- 
ences in behavior. In the course of some of the 
research reported above, interviewers were required 
to rate their respondents on semantic differential 
scales designed to tap a set of traits and qualities 
evident in ordinary social interaction. The differ- 
ences between ratings of Yeasayers and ratings of 
Naysayers were not large, but they conformed to 
expectations based on theory, and a number of 
them were statistically significant. This finding too 
supports the premise that Yeasayers and Naysayers 
differ in overt behavior. 

In another study, not yet reported, housewives 
were asked about their use of brands in a variety 
of product categories, and replies were verified by a 
pantry check. In line with their usual enthusiasm 
for responding, Yeasayers outclaimed Naysayers in 
nine categories out of ten; but in four of the ten 
categories, Yeasayers actually had significantly more 
of the commodity on hand. In a kindred study, 
respondents were asked whether they remembered 
any of a set of advertisements they had seen. If 
they claimed recall, they were asked further ques- 
tions designed to let them prove it. As usual, the 
Yeasayers outclaimed the Naysayers, but they also 
proved more recall. The difference in proved recall 
was not as great as the difference in claimed recall, 
but it was clearly there. 

Taken together, these findings provide compel- 
ling evidence that Yeasayers differ from Naysayers 
in behavior other than response to questionnaires. 
Considering the nature of the variable, it is rea- 
sonable to guess that these behavior differences 
extend beyond purchasing patterns and memory
for advertisements to include differential exposure 
to different magazines and TV programs, differences 
in impulse buying, differences in receptiveness to 
new products, differences in susceptibility to per- 
suasion. Investigation of these differences, and sep- 
aration of behavior differences from response style 
artifacts, present exciting challenges for the future. 


REFERENCES 


BARNES, E. H. Response Bias in the MMPI. Journal of Consulting
Psychology, Vol. 20, 1956, pp. 371-374.

BASS, B. M. Development and Evaluation of a Scale for Measuring
Social Acquiescence. Journal of Abnormal and Social Psychology,
Vol. 53, 1956, pp. 296-299.

BERG, I. A., W. A. HUNT AND E. H. BARNES. The Perceptual Reaction
Test. Evanston, Illinois: I. A. Berg, 1949.

BERG, I. A. AND G. M. RAPAPORT. Response Bias in an Unstructured
Questionnaire. Journal of Psychology, Vol. 38, 1954, pp. 475-481.

CHAPMAN, L. J. AND D. T. CAMPBELL. The Effect of Acquiescence
Response-set upon Relationships among the F Scale, Ethnocentrism,
and Intelligence. Sociometry, Vol. 22, 1959, pp. 153-161.

COUCH, A. AND K. KENISTON. Yeasayers and Naysayers: Agreeing
Response Set as a Personality Variable. Journal of Abnormal and
Social Psychology, Vol. 60, No. 2, March 1960, pp. 151-174.

CRONBACH, L. J. Response Sets and Test Validity. Educational and
Psychological Measurement, Vol. 6, 1946, pp. 475-494.

CRONBACH, L. J. Further Evidence on Response Sets and Test Design.
Educational and Psychological Measurement, Vol. 10, No. 1, 1950,
pp. 3-31.

GAIER, E. L. AND B. M. BASS. Regional Differences in Interrelations
among Authoritarianism, Acquiescence, and Ethnocentrism. Journal of
Social Psychology, Vol. 49, No. 1, 1959, pp. 47-51.

HARE, A. P. Interview Responses: Personality or Conformity? Public
Opinion Quarterly, Vol. 24, No. 4, 1960, p. 679.

HATHAWAY, S. R. Some Considerations Relative to Nondirective
Counseling as Therapy. Journal of Clinical Psychology, Vol. 4, 1948,
pp. 226-231.

JACKSON, D. N. AND S. MESSICK. Content and Style in Personality
Assessment. Psychological Bulletin, Vol. 55, No. 4, July 1958,
pp. 243-252.

JOHNSTON, A. M. The Relationship of Various Factors to Autocratic
and Democratic Classroom Practices. Unpublished doctoral
dissertation, University of Chicago, 1948.

KASSEBAUM, G. G., A. S. COUCH AND P. E. SLATER. The Factorial
Dimensions of the MMPI. Journal of Consulting Psychology, Vol. 23,
1959, pp. 226-236.

LENTZ, T. F. Acquiescence as a Factor in the Measurement of
Personality. Psychological Bulletin, Vol. 25, 1938, p. 659.
(Abstract)

LEWIS, N. A. AND J. A. TAYLOR. Anxiety and Extreme Response
Preferences. Educational and Psychological Measurement, Vol. 15,
1955, pp. 111-116.

LORGE, I. Gen-like: Halo or Reality? Psychological Bulletin,
Vol. 34, 1937, pp. 545-546. (Abstract)

MERSMAN, I. Personality Traits as Related to Vocational Choice.
Unpublished masters’ thesis, University of Chicago, 1948.

PHILIP, B. R. Generalization and Central Tendency in the
Discrimination of a Series of Stimuli. Canadian Journal of
Psychology, Vol. 1, 1947, pp. 196-204.

RUBIN, H. K. A Constant Error in the Seashore Test of Pitch
Discrimination. Unpublished masters’ thesis, University of
Wisconsin, 1940.

RUNDQUIST, E. A. Response Sets: A Note on Consistency in Taking
Extreme Positions. Educational and Psychological Measurement,
Vol. 10, 1950, pp. 97-99.

SINGER, W. B. AND P. T. YOUNG. Studies in Affective Reaction: III.
The Specificity of Affective Reactions. Journal of General
Psychology, Vol. 24, 1941, pp. 327-341.

WEBSTER, H. The Meaning of “Response Set” in Personality
Inventories. American Psychologist, Vol. 15, No. 7, July 1960,
p. 431. (Abstract)


The less a science has advanced the more its terminology tends to rest upon an
uncritical assumption of mutual understanding. With increase of rigor this basis
is replaced piecemeal by the introduction of definitions. The interrelationships
recruited for these definitions gain the status of analytic principles; what was
once regarded as a theory about the world becomes reconstrued as a convention
of language. Thus it is that some flow from the theoretical to the conventional is
an adjunct of progress in the logical foundations of any science.


Ad Recognition and Respondent Set


VALENTINE APPEL and MILTON L. BLUM


Marketing, Merchandising and Research, Inc. 


Often an ad is evaluated in terms of the proportion of people who
claim to have seen it before. The authors find that this claim depends
not only on the person’s exposure to the ad, but also on the number
of magazines he reads and his interest in the product advertised.


A COMMON FORM OF EVALUATING advertising effectiveness
in the print media has to do with determining
the extent to which the ad in question
has been remembered by the reader to whom it was
exposed. Although many methods have been pro- 
posed for measuring ad remembrance, by far the 
most widely used is the recognition method such as 
the one employed by the Starch Advertisement 
Reading Service. 

The measurement of ad remembrance was con- 
sidered so important to the field of advertising that 
in 1956 it was the subject of a large research under- 
taking supervised by the Advertising Research 
Foundation. One of the more striking findings of 
this Study of Printed Advertising Rating Methods 
(the PARM study) was the fact that measures of ad 
recognition appear insensitive to the passage of 
time. As Lucas (1960, p. 15) pointed out in his 
evaluation of the PARM study, this “poses a pro- 
found question. Recognition is a type of memory, 
and memory should decline with the passage of 
time.” This observation was made by the authors 
independently of the Lucas article and was, in fact,
the stimulus which encouraged the present study.1

The authors reasoned that since the conventional
ad recognition methods do not conform to what
might be expected either on the basis of learning
theory or common sense, there was reason to believe
that a substantial portion of the variance among
advertisements of such recognition measures is
attributable to factors other than exposure of a
particular ad in a particular magazine issue. The major
purpose of the present study was to subject this
hypothesis to experimental test, and to identify
some of the factors, not related to actual exposure,
which contribute to the between-ad variance.

1 The study was sponsored by Life magazine in line with
their continued interest in advertising research. The authors
wish to thank Dr. Richard H. Ostheimer, Life Director of
[...] their continued suggestions and encouragement in the
completion of this exploratory research.

The ad recognition measure which was employed 
for the purpose of the present research was the 
noted score, defined by the Starch Magazine Ad- 
vertisement Reading Service (1955) as “...a 
measure of power of an advertisement to secure the 
attention of readers. It is the percent of readers of 
the current issue who remembered, when inter- 
viewed, that they had previously seen the advertise- 
ment in the particular publication.” 

On the basis of a previous pilot study conducted 
by the authors, a number of hypotheses were formu- 
lated concerning the non-recognition factors which 
contribute to ad noting. The present research is 
specifically designed to test these hypotheses: 


1. The noted score is not an uncontaminated 
measure of ad recognition. Only part of the 
noted score variance is attributable to ad 
recognition per se. The remaining variance is 
attributable to such factors as consumer interest
in the product advertised, nearness to
purchase of the brand advertised, etc. 

2. There are certain consumers who have a 
higher tendency to note ads and there are 
other consumers who have a lesser tendency 
toward ad noting, regardless of whether they 
were actually exposed to the ad. 

3. This noting set, or tendency to note ads, is re- 
lated to multi-magazine readership. Multi- 
magazine readers are more likely to note ads 
falsely than non-multi-magazine readers. 


METHOD 
Research Design 


The research design involved interviewing two 
matched samples of Life readers in Life-subscribing 
households, using the February 15, 1960 issue. One 
sample was qualified as having been previously ex- 
posed to the test issue; the other sample had not 
been exposed to the issue prior to the interview. 

The selection of these two matched samples was 
accomplished as follows. The February 15 issue of 
Life was scheduled to arrive in subscribers’ house- 
holds on Thursday and Friday, February 11 and 12. 
Accordingly 492 interviews were conducted on two 
successive Wednesdays: 197 on February 10, prior 
to arrival of the test issue, and 295 on February 17, 
about a week after arrival of the test issue. 

Respondents interviewed February 10, before they
could have been exposed to the February 15 issue,
were qualified as Life readers using the cover of the
then current February 8 issue. Those interviewed
February 17, after exposure to the issue, were qualified
using the then current February 15 cover. In
addition to being qualified as Life readers, all respondents
were qualified with the February 2 and
February 16 issues of Look, and the January 30
and February 6 issues of the Saturday Evening Post.
Those respondents interviewed February 17 were
also qualified using the February 13 issue of the
Saturday Evening Post.

All test copies of the magazine were prepared in
New York, using the first available copies of the
issue. These test copies were prepared February 8
in the authors’ office. The same evening, test copies
were shipped to the interviewing areas in the following
condition:


1. The cover pages and the four-fold spread containing
the table of contents had been removed.

2. Sixteen additional full page ads had been inserted
at different points in the issue selected
so as not to interfere with the editorial flow.
These sixteen ads, representing western regional
brands, all had been run in previous
Life western editions, but none had ever appeared
in the interviewing areas. Hereafter
these 16 ads will be referred to as the bogus ads.


All respondents were female, aged 18 to 70, in
households selected at random from Life subscription
lists, geographically clustered in Atlanta, Chicago,
Hartford, and Queens and Westchester counties,
N. Y. The interviewer assignments were such
that within each of the interviewing areas the sampling
clusters were randomly divided in two, with
half the clusters being assigned February 10 and
the other half February 17.





Questionnaire

The questionnaire had four parts:


1. The first part concerned itself with the qualification
of respondents as current issue readers
using the aforementioned Life, Look and Post
covers. For each cover, the respondent was
asked: “I would like you to tell me whether
you have looked into or read any part of this
issue.”

2. The second part of the interview consisted of
the presentation of the ads within the context
of the magazine. In addition to the 16 bogus
ads, a total of 53 half-page or larger ads, appearing
in the test issue, were shown to all
respondents.

The February 10 respondents were asked the
following question about each ad: “I would
like to know about the advertisements that
you definitely remember having seen or read
before, either in this magazine or in some
other.” This wording was used in order to
minimize the number of discontinued interviews
by respondents who might have been
aware that they had never seen the particular
issue being used in spite of the fact that the
cover and table of contents pages had been removed.
On February 17 the question read:
“. . . definitely remember having seen or read
in this issue.”

The ads were numbered consecutively from 1
to 74. (There were an additional five split-run
ads in the test issue, which were excluded from
the analysis.) However, in order to minimize
any fatigue effects that might have developed
in relation to the ads shown near the end of
the magazine, the starting points were systematically
varied. Ads numbered 1, 21, 41 and
61 were the four different starting points used.

3. The third part of the questionnaire contained
a list of 40 different brands which were expected
to be included in the test issue. For
each brand, the respondent was asked to classify
herself as having: bought this brand
within the past month; or not bought this
brand within the past month, but intending
to buy it within the next month; or neither
bought this brand this past month nor planning
to buy it next month.

For products purchased infrequently, such as
durables, the period of time used was a year
rather than a month.

4. The fourth part asked, for each of 30 product
categories, whether the respondent found advertisements
about the product class to be
very interesting, somewhat interesting, or not
at all interesting.
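
The rotation of starting points described in the second part of the questionnaire can be sketched in a few lines of Python. This is only a sketch under assumptions the text does not state: that an interviewer who began at ad 21, 41 or 61 wrapped around to ad 1 after reaching ad 74, and that the four starting points were assigned in turn to successive interviews. The function names are hypothetical.

```python
def presentation_order(start, n_ads=74):
    """Ad numbers in the order shown, beginning at `start` and wrapping
    around to ad 1 after the last ad (wrap-around is an assumption)."""
    return [(start - 1 + i) % n_ads + 1 for i in range(n_ads)]

# The four starting points named in the text.
STARTING_POINTS = (1, 21, 41, 61)

def order_for_interview(interview_index):
    """Assign starting points in turn to successive interviews
    (one possible reading of "systematically varied")."""
    start = STARTING_POINTS[interview_index % len(STARTING_POINTS)]
    return presentation_order(start)

print(presentation_order(61)[:3])     # [61, 62, 63]
print(presentation_order(61)[13:16])  # [74, 1, 2]
```

Whatever the actual assignment rule, the point of the design is that each block of ads falls late in the interview for only a share of respondents, so fatigue is spread across ads and not concentrated on those at the back of the issue.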


RESULTS 


The Present Method Versus the Standard Rating Service 


Since the method employed here differs in cer- 
tain respects from the recognition method employed 
by the standard ad recognition rating service, it is 
necessary to establish whether the two methods 
measure essentially the same kind of behavior. The 
degree of similarity between the scores reported by 
the rating service and those generated by the au- 
thors’ method can be seen from the .90 Pearson 
product moment correlation coefficient between the 
rating service noted scores for women, and the authors’
scores after exposure to the test issue. This
correlation indicates that the two methods measure
essentially the same aspect of behavior. However,
as may be seen in Table 1, the two sets of scores
differ in that the authors’ scores after exposure, although
equally variable, were on the average 22
percentage points higher than were the rating service
scores.


TABLE 1

COMPARISON OF AUTHORS’ WITH RATING SERVICE
NOTED SCORES (N = 53 ADS)

                      Authors    Rating Service
Mean Noted Score      47.7%      26.0%
Standard Deviation    16.7%      16.8%


A number of possible differences between the au- 
thors’ method and the rating service method may 
account for this difference in absolute score level, 
although no one of these methodological differences 
can be directly supported as the major reason ac- 
counting for the observed statistical difference. One 
factor which distinguishes the two methods, but 
which does not explain the difference, is the fact 
that the authors’ sample included only subscribing 
households, whereas the rating service procedure 
may also include newsstand buyers and pass-along 
readers. The PARM study (Vol. IV, Bulletin 7) 
clearly indicated that recognition scores do not 
vary substantially as a function of how the copy 
was obtained. 

It also appears unlikely that the difference in 
mean noted score of the size reported is attributable 
to the fact that the authors’ sample was concen- 
trated in a small number of large metropolitan 
areas whereas the rating service’s sample was more 
geographically dispersed. Although sample design 
may account for a small part of the difference, it 
cannot account for all of the 22 percentage point 
spread (PARM, Vol. IV, Bulletin 2). 

Another factor which may account for part of 
the difference is the fact that the interview, as con- 
ducted by the authors, was shorter than the one 
conducted by the rating service. The authors’
interview was shorter because the
rating service obtains additional information for
those ads which the respondent claims to have
noted: “‘Seen-Associated—which measures those read- 
ers who had seen or read the advertisement suffi- 
ciently to know the product or advertiser; Read 
Most—which measures those readers who read 50% 
or more of the reading matter of the advertisement; 
Determination of the observation and reading of 
component parts—headline, picture(s), sub-head- 
ings, text units and signature” (Starch, 1946, pp. 
3-4).
The present interview obtained data pertaining
to noting only. The depressant effects of length of 
interview upon noted score have been well docu- 
mented by Franzen (1942), Starch (1946), and more 
recently PARM (ARF, 1956). It should be pointed 
out, however, that the size of the difference between 
the authors’ and the rating service’s noted scores is 
too large to be explained solely in terms of respond- 
ent boredom or fatigue. Undoubtedly the difference 
is caused by a number of different factors operating 
simultaneously. 

The important point is that two basically similar 
methods (the correlation between them being .90) 
produced drastically different mean noted scores. 


The Noting Set 


A major hypothesis which the research was de- 
signed to test was that certain individuals tend to 
claim recognition of magazine advertisements and 
others tend to deny having seen such advertise- 
ments, regardless of whether they were actually ex- 
posed to them. For any given individual, therefore, 
it was hypothesized that it would be possible to 
predict the number of ads that would be noted after 
exposure on the basis of the number of bogus ads 
noted. The tendency to note or not to note we have 
labeled a noting set. 

Table 2 confirms the noting set hypothesis. 
Among those who, after exposure to the test issue, 
noted no bogus ads, the mean ad noting was 28 per 
cent. Among those who noted eight or more bogus 
ads, the mean ad noting was 75 per cent. A similar 
pattern applies before test issue exposure. From 
this, it follows that regardless of actual exposure, 
certain individuals are likely to produce higher 
noted scores than are others. 


TABLE 2
BOGUS AD RECOGNITION RELATED TO
AD NOTING FOR 53 REAL ADS

                   After Test Issue Exposure    Before Test Issue Exposure
Number of Bogus    Number of      Mean Real     Number of      Mean Real
Ads Noted          Respondents    Ad Noting     Respondents    Ad Noting
Eight or more          54            75%             8            66%
Three to seven         89            53%            44            43%
One or two             90            38%            76            30%
None                   62            28%            69            19%


Table 3 confirms the hypothesis that one of the 
factors related to the noting set is multi-magazine 
readership. Life readers who claimed also to have
read one or more of the recent issues of Look and/or
the Post upon which they were qualified noted
more bogus ads than did those who read Life only. 
A greater disposition toward noting, whether the 
ads be real or bogus, appears to be a characteristic 
associated with multi-magazine readership. 


TABLE 3
MULTI-MAGAZINE READERSHIP RELATED TO AD
NOTING FOR 16 BOGUS ADS

                         After Test Issue Exposure    Before Test Issue Exposure
Other Cover              Number of      Mean Bogus    Number of      Mean Bogus
Qualifications           Respondents    Ad Noting     Respondents    Ad Noting
Both Look and Post           40            36%            38            18%
Either Look or Post         127            23%            76            11%
Neither Look nor Post       128            21%            83             8%


The question naturally arises as to whether the 
data of Tables 2 and 3 reflect merely the same type 
of distortion: false noting of ads in Table 2 and 
false reading of issues in Table 3. The available 
evidence contradicts this interpretation. From the 
PARM study (Vol. IV, Bulletin 2), it is apparent 
that whether respondents are qualified by mere 
cover recognition, or whether they are also required
to describe one of the articles in the issue
(thereby verifying their exposure), makes little difference
in the mean noted score. This finding was
further verified by a small pilot study conducted by
the authors in which respondents who were able to
qualify as readers only on the basis of cover recognition
(with all identifying slugs removed) were
compared with respondents who, in addition to
claiming readership, also were able to describe one
or more articles. There were no significant differences
between these two groups in terms of ad noting.
The evidence, therefore, argues against false
claiming of readership accounting for the relation- 
ship between bogus ad noting and multi-magazine 
readership. 


Confounding Variables 


To test the hypothesis that the noted score after
exposure is not purely a measure of ad recognition,
the Pearson product moment correlation coefficient
was computed between data gathered prior to exposure
and the data gathered after exposure to the
test issue. The correlation between these two sets of
data was .72. It is clear from this that the noted
score is highly predictable in advance of issue exposure,
i.e., before the ad has ever been run. This
may be interpreted to mean that about half (52%)
the noted score variance following exposure can be
attributed to factors other than recognition of a
particular ad in a particular magazine issue. It becomes
apparent, therefore, that any comparison between
ads simply on the basis of their noted scores
is likely to prove faulty.
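
The 52 per cent figure follows from squaring the correlation of .72. A minimal sketch of that arithmetic is given here; the function name is an illustrative assumption and not part of the original study.

```python
def shared_variance(r):
    """Proportion of variance in one score accounted for by the other,
    given their Pearson correlation r (the coefficient of determination)."""
    return r * r


# Correlation between pre- and post-exposure noted scores, from the text.
r_pre_post = 0.72

predictable = shared_variance(r_pre_post)   # share predictable before exposure
remaining = 1 - predictable                 # share left for exposure and error

print(f"predictable in advance of exposure: {predictable:.0%}")
print(f"not predictable in advance:         {remaining:.0%}")
```

The squared correlation is the usual way of expressing how much of one measure's variance can be predicted linearly from another.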

Three non-recognition variables have been identified
which contribute to spurious ad recognition.
These are:


1. Apparent respondent inferences attributable
   to familiarity or unfamiliarity with the editorial
   content of the issue,

2. Degree of interest in the product advertised,
   and

3. Nearness to purchase of the brand advertised.


Respondent Inference. As expected, the noted 
scores after exposure were higher than they were be- 
fore exposure for the 53 real ads. But the noted 
scores after exposure were also significantly higher 
for the bogus ads. Table 4 shows that the mean 
noted score on the bogus ads increases from 12 per 
cent prior to exposure to 25 per cent following ex- 
posure to the test issue. Part of the increase in the 
noted score for the real ads seems attributable not 
to recognition, but rather to some sort of respond- 
ent inference, either conscious or unconscious. Ap- 
parently the exposed readers generalize their famil- 
iarity with the editorial material to the advertising. 
Their inference would appear to be: “I’ve read the 
issue, so I must also have seen the advertisement.” 


TABLE 4
AD RECOGNITION BEFORE AND AFTER
ISSUE EXPOSURE

                              Number of
Mean Noted Scores             Respondents    16 Bogus Ads    53 Real Ads
After test issue exposure         295            25%             48%
Before test issue exposure        197            12%             31%


Degree of Product Interest. For each of 30 dif- 
ferent product categories, respondents were asked 
whether they: “. . . personally find the advertise- 
ments about the product to be: very interesting, 
somewhat interesting or not at all interesting.” 

Of the 53 ads, 36 could be classified as falling 
into one of the product categories for which such 
ad interest data were available. Both before and
after exposure to the ad, those respondents claiming
to be very interested in the ads of the product class
gave higher noted scores than those claiming to be
not at all interested. These data are summarized in
Table 5.


TABLE 5
PRODUCT INTEREST RELATED TO AD NOTING
FOR 36 REAL ADS

                              Interest in Ads of Product Class
                              Very          Somewhat      Not at All
Mean Noted Scores             Interested    Interested    Interested
After test issue exposure        61%           48%           36%
Before test issue exposure       40%           33%           22%


These differences in ad noting as a function of 
product interest do not reflect merely the fact that 
people are more likely to read ads in which they are 
interested, nor merely the probability that the same 
or similar ads were seen before in other issues or in 
other magazines. Data were available on ad interest 
for 13 of the 16 bogus ads. The noted scores for 
these ads, presented in Table 6 as a function of ad 
interest, clearly indicate a rise in bogus ad noting 
with increasing interest in ads of the product class. 


TABLE 6
PRODUCT INTEREST RELATED TO AD NOTING
FOR 13 BOGUS ADS

                              Interest in Ads of Product Class
                              Very          Somewhat      Not at All
Mean Noted Scores             Interested    Interested    Interested
After test issue exposure        38%           29%           18%
Before test issue exposure       13%           12%            1%


In this case, the likelihood that respondents 
might have previously seen the same or similar ads 
was quite remote, because the brands involved 
were relatively unknown outside of the western 
United States. It follows, therefore, that regardless 
of actual exposure, respondents with high interest 
in a product category are more likely to claim not- 
ing the ads of that product category than are those 
who are less interested. 

Nearness To Purchase. We may presume that a 
similar relationship would apply as a function of 
nearness to purchase, although it is not possible to 
test this hypothesis using bogus ads, since they rep- 
resent brands not available for purchase. 

Data concerning purchase behavior were avail- 
able, however, for 25 of the 53 brands for which 
noted scores on the real ads were obtained. For each 
of these 25 brands, each respondent was asked 
whether she had bought that brand within the last 
month and, if not, whether she intended to buy it
in the next month. In the case of less frequently
purchased items, the time period was extended to 
a year. 

Those claiming they had recently bought or were 
about to buy the brand in question were classified 
as being near purchase. All others were classified as 
not being near purchase. Table 7 shows that re- 
spondents claiming to be near purchase produced 
higher noted scores than those claiming not to be 
near purchase. 


TABLE 7
NEARNESS TO PURCHASE RELATED TO AD
NOTING FOR 25 REAL ADS

                              Nearness to Purchase
Mean Noted Scores             Near      Not Near
After test issue exposure     a7%         47%
Before test issue exposure    44%         30%


Both before and after exposure to the test issue 
consumers tend to note the ads of brands which 
they have recently purchased, or believe that they 
are about to purchase, to a greater extent than the 
ads of brands which they have not. The implication
is that brands with a larger share of market, other
things being equal, are likely to obtain higher
noted scores than are brands with a smaller market
share.

These data raise a question concerning the extent
to which the often demonstrated differences in
ad noting between the buyers and non-buyers of a
particular brand prove that the advertising has
resulted in increased sales (e.g., Starch, 1958). There
is evidence that the reverse is true, at least in part.
As measured by differences in the noted score between
buyers and non-buyers prior to issue exposure,
increased sales appear also to result in increased
“advertising effectiveness!”¹


A New Measure of Ad Recognition 


Because of these confounding factors, measures 
of ad noting are equivocal at best. The problem be- 
comes one of developing a measure which is inde- 
pendent of the influences of the spurious non-recog- 
nition factors which are related to the noted score, 
but which have little to do with real recognition. 

Noted scores after exposure were predictable
prior to exposure with a correlation of .72. This
indicates that it is possible to isolate and compensate
for these confounding influences to the extent
that they exert a differential influence upon different
ads. Certain ads, because of their resemblance
to other ads, the market share of the brand being
advertised, or consumer interest in the product
class, are at an unfair advantage or disadvantage
relative to other ads.


1 There is reason, however, to question whether these findings
support Festinger’s (1960) conclusion that such increased
ad recognition among buyers relative to non-buyers reflects
a form of self-justification of the purchase. The fact that
these differences persist even in advance of issue exposure
questions the validity of this interpretation.


The Regression Method 


The effect of such factors can be compensated for
by employing pre-exposure noted scores to predict
the scores after exposure. The measure of recognition
is the extent to which the noted score after exposure
exceeds or falls short of what one might
expect on the basis of the score obtained prior to
exposure.

Figure 1 shows the regression of post-exposure
scores on pre-exposure scores of the 53 real ads.
The vertical axis represents the noted scores for
readers after exposure. The horizontal axis represents
the scores of readers before exposure. The
regression line represents the best linear prediction
of the exposed readers’ scores from the scores of the
non-exposed readers.

The deviation of the exposed readers’ scores from 
the predicted scores for each ad can be calculated. 
The greater the deviation or residual above the re- 
gression line, the greater is the effect attributable to 
actual issue exposure. This is not to say that devia- 
tions below the regression line do not represent 
some sort of remembrance. The regression line rep- 
resents merely the most likely score after exposure, 
based upon the score prior to exposure. As such, 
the residual scores are interpretable only in relative 
terms, i.e., one ad relative to another. There is no
absolute zero point.²

Because of the sample sizes involved in the calculation
of the noted scores, relatively little of the
residual variance can be attributed to unreliability
of measurement. The discrepancy, therefore, between
the predicted and the actual score must be
attributed to the effects of exposure.
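
The regression method described above can be sketched in a few lines: fit the least-squares line predicting each ad's post-exposure score from its pre-exposure score, then take the deviation of each ad from that line as its residual score. The five pairs of scores below are hypothetical values chosen for illustration, not data from the study, and the function names are assumptions.

```python
from statistics import mean


def fit_line(pre, post):
    """Least-squares line predicting post-exposure from pre-exposure scores.

    Returns (slope, intercept)."""
    mx, my = mean(pre), mean(post)
    sxx = sum((x - mx) ** 2 for x in pre)
    sxy = sum((x - mx) * (y - my) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def residual_scores(pre, post):
    """Actual post-exposure score minus the score predicted from pre-exposure."""
    slope, intercept = fit_line(pre, post)
    return [y - (intercept + slope * x) for x, y in zip(pre, post)]


# Hypothetical noted scores (per cent) for five ads; NOT the study's data.
pre = [10, 20, 30, 40, 50]     # noted before test issue exposure
post = [30, 35, 55, 50, 70]    # noted after test issue exposure

res = residual_scores(pre, post)
for ad_number, r in enumerate(res, start=1):
    print(f"ad {ad_number}: residual {r:+.1f}")
```

Because least-squares residuals always sum to zero, a residual score only ranks one ad against the others in the same set, which is the sense in which the measure has no absolute zero point.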


2 In this respect, the regression method is different from
the Lucas (1940) method for controlling for confusion which
is also based upon pre-exposure noted scores. Although the
two methods have obvious similarities, they differ to the extent
that Lucas was attempting to develop an absolute recognition
measure by reducing the after exposure noted score as
a function of those respondents claiming false recognition in
advance of exposure. An assumption inherent in the Lucas
method, which is not necessary in the regression method, is
that the only type of confusion has to do with false noting.
No provision is made by Lucas for the possibility of false
denial which may spuriously lower the noted scores for some
ads, e.g., a lingerie ad among a population of male readers.


Figure 1: REGRESSION OF POST- ON PRE-EXPOSURE NOTING SCORES


Color and Size

The regression method is well illustrated when 
we analyze the effects of color and size upon ad 
recognition. The same data which were presented 
graphically in the accompanying chart are organ- 
ized in tabular form in Table 8. This table presents 
the 53 real ads separated into two groups: 20 half- 
page or full-page black and white ads, 19 of which 
lie below the regression line, and 33 full-page two 
color or four color ads, 24 of which lie above the 
regression line.
These crude categories are made necessary by
virtue of the relatively small number of ads in- 
volved. The noted scores for these ads are pre- 
sented on the left of the table and the residual 
scores on the right. By either measure, the full-page 
color ads can be seen to obtain higher mean scores 
than the half-page or black and white ads. 

These findings are well in line with those of oth- 
ers who have published in this field, including 
Franzen (1942) and Twedt (1952), as well as Starch 
(1946). There is evidence to indicate that these
writers, because they did not take into account the
various factors which contribute spuriously to the
noted score, have probably understated the real influence
which size and color can exert upon ad
recognition. Inspection of Table 8 indicates much
less overlap in the residual score distributions than
in the noted score distributions. Presenting the
same data in more statistical and capsule form, the
point biserial correlation between the size and
color dichotomy and the recognition score is .74 in
the case of the residual scores, and .48 in the case of
the noted scores.


TABLE 8
SIZE AND COLOR RECOGNITION EVALUATED BY TWO DIFFERENT METHODS
Noted Score Method    Residual Score Method

The inference to be drawn is that the regression 
method, because the noted score is so strongly con- 
founded by non-recognition factors, shows the ef- 
fects of size and color to be stronger than does the 
simple noted score. 
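
A point biserial correlation is simply the Pearson correlation between a two-valued variable (here, color full-page ad versus other) and a continuous score. The sketch below shows the computation; the six ads and their scores are hypothetical and were chosen only so that the residual scores separate the two groups more cleanly than the noted scores do.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def point_biserial(group, score):
    """Correlation between a 0/1 group flag and a continuous score."""
    return pearson(group, score)


# Hypothetical ads; NOT the study's data.
# 1 = full-page color ad, 0 = half-page or black and white ad.
is_color_full_page = [0, 0, 0, 1, 1, 1]
residual = [-6.0, -2.0, -4.0, 3.0, 5.0, 4.0]
noted = [30.0, 55.0, 40.0, 45.0, 60.0, 50.0]

r_residual = point_biserial(is_color_full_page, residual)
r_noted = point_biserial(is_color_full_page, noted)
print(f"residual scores: r = {r_residual:.2f}")
print(f"noted scores:    r = {r_noted:.2f}")
```

Less overlap between the two groups' score distributions shows up as a larger coefficient, which is the comparison the text draws between .74 and .48.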


CONCLUSIONS 


First, comparisons of the noted scores of identical 
ads in different magazines appear questionable, 
particularly when one magazine has a larger audi- 
ence than the other. The magazine with the smaller 
audience will in general have a larger proportion 
of multi-magazine readers than the magazine with 
the larger audience. Since we have established that 
multi-magazine readers, by virtue of their noting 
set, are more likely falsely to claim noting of ads 
than are non-multi-magazine readers, it follows that 
the smaller audience magazine tends to be credited 
with higher noted scores on the average than does 
the larger audience magazine. In addition, the
smaller audience magazine will benefit by actual
multiple ad exposure to a greater extent than will 
the magazine with the larger audience. Although 
not directly pertinent to this paper, the authors do 
have evidence to indicate that multiple ad exposure 
increases the noted score over and above what one 
might expect purely on the basis of noting set. 

The futility of attempting to develop some sort
of correction device which would compensate for
the various factors which are recognized to confound
the noted score is also evident. The ultimate
objective of corrective devices of this sort, such as
those proposed by Lucas (1940), Moran (1951)
and others, is to develop a measure of
recognition which is interpretable as an absolute.
A score of 32 per cent, for example, should mean that
32 per cent of respondents were physically exposed
to and can recognize the ad. The fact that two methods
as similar as those of the authors and the
rating service can agree so closely (r = .90) in relative
terms, and disagree so completely in absolute
terms (48 to 26 per cent), belies the wisdom of
searching for this type of solution. These same ads,
were they measured by the Gallup and Robinson 
aided recall method (ARF, 1956), would probably 
have produced scores of the order of 3 to 5 per 
cent. If measured by the Ule (1958) method, the 
scores would have been of the order of 70 per cent. 
And if measured in terms of the respondents’ 
claimed page exposure, as was done by Time, Inc. 
(1959) using Life magazine, scores would have been 
of the order of 90 per cent.

Obviously, the absolute magnitude of the ad rating
can readily be influenced by the method employed.
A method using minimal prompting such as
aided recall is bound to produce lower scores than 
one which uses a greater amount of prompting such 
as the recognition method. One which motivates the 
respondent to claim recognition, as in the Ule 
method, will produce still higher scores. 

There is no absolute measure of ad remembrance. 
The best that can be hoped for is a device which 
will allow the unbiased evaluation of one ad 
against a population of other ads, or as many sub- 
populations as may be indicated. Such sub-popula- 
tions might have to do with size, color, product 
class, media, etc. 

The simple noted score, because of the fact that 
it is confounded with many factors completely un- 
related to exposure of a particular ad in a particu- 
lar issue, appears to be severely limited for this 
purpose. Ads for high-interest products, for exam- 
ple, are likely to generate more false noting than 
are ads for low interest products. The same appears
to be true, other things being equal, of ads for
brands having a high share of market relative to
brands having a lower share of market.


The regression method overcomes most, if not 
all, of these problems while generating a recogni- 
tion score which is free of the confounding influ- 
ences which are unrelated to exposure and which 
are measurable in advance of exposure to the mag- 
azine. As such, it points the way toward developing 
a measure which would allow for a valid relative 
comparison between different printed advertise- 
ments. 

One difficulty which must be faced in connection 
with the regression method, or any other method 
which makes use of a pre-exposure score, is that a 
campaign which may effectively use the same or
similar ads repeatedly is certain to be at a disadvantage
relative to one which varies its copy approach
from ad to ad. A high pre-exposure score
and a corresponding low residual score may really 
signal an effective and well remembered campaign, 
though the effect of any given insertion upon recog- 
nition may be negligible. The point is that the ef- 
fectiveness of a campaign cannot be measured solely 
in terms of the recognition of a particular ad in a 
particular magazine issue. Measures of ad recogni- 
tion, however valid and unbiased, can only be 
evaluated in relation to the other factors known to 
be relevant to the advertiser’s objective.


It ain’t so much the things we don’t know that get us in trouble. It’s the things
we know that ain’t so.


Some Correlates of Coffee and Cleanser Brand Shares


From an analysis of purchases, prices, preferences and promotion
of two everyday products, Dr. Banks is able to find some factors 
likely to be important in explaining a brand’s share of the market. 


A THEORY OF MARKET DEMAND for brands must
consider two major elements: first, the choice
process within the mind of the consumer; and sec- 
ond, the marketing environment in which purchase 
takes place. This paper describes a model of mar- 
ket demand for brands of convenience goods and 
reports the results of a test of this model. 

All discussion of demand in this paper is in terms 
of ratios, i.e., a brand’s share of the market. If one 
attempted to deal with demand in an absolute 
sense, these ratios would have to be multiplied by a 
base which would consider such factors as the im- 
portance of the product in consumers’ budgets and 
the level of national income. This is a task far 
greater than seems desirable at the moment, and 
one which is not necessarily required for realism.
Many businesses consider primary demand trends to
be out of their control, and evaluate their relative 
success in terms of selective demand position. 

The general demand model may be written:


\[
P_i = C_i\, f_1(A_1, A_2, \ldots) + R_i\, f_2(B_1, B_2, \ldots) + W_i\, f_3(\ldots) + M_i\, f_4(\ldots)
\]


where P_i is a brand’s share of the market. Market
share is taken to mean a brand’s share of the total 
volume of sales of the given product class in a cer- 
tain geographical area. 

The terms on the right of the equation are of two
types. The first term (C_i) deals with consumer evaluation
of the intrinsic attributes of a brand, and the
remainder with the marketing efforts of the component
elements of the channel of distribution: R
for retailer, W for wholesaler and M for manufacturer.

The A’s in the consumer term of the above equa- 
tion are criteria by which consumers evaluate the 
intrinsic qualities of various brands of a given type 
of merchandise. For coffee, these qualities might 
include flavor, flavor consistency, bouquet, type of 
grind, and size and type of package. These criteria 
will differ from person to person in number and 
importance. Furthermore, since judgment is sub- 
jective, individuals with identical criteria may have 
different evaluations of a given brand. The evaluation
of each brand on all criteria considered by a
consumer leads to a consolidated judgment of that
brand at that time.

This mental evaluation of brands by a consumer
can be visualized as an archipelago with some peaks
rising out of a sea while others are visible below
the surface. Sea level corresponds to a level of acceptability
for brands of a given product. The peaks
represent scalar evaluations of the qualities of the
various brands at a given time. Brands are con- 
sidered acceptable in the sense that, by their 
intrinsic qualities alone, they would be considered 
as possible purchases. For example some brands of 
coffee may not be acceptable because they have too 
mild or too strong a flavor or do not come in the 
desired grind. 

But the above picture holds only temporarily. As 
time passes, brands may lose their acceptability, 
either because their qualities have actually deteriorated
or because other brands have been improved.
Brands previously unacceptable may rise to
acceptability by product improvement. A scouring
cleanser, for example, which was changed from an
abrasive to a detergent increased sales considerably.


Then too, the level of acceptability is subject to 
change. In times of shortage, consumers take almost 
any brand. But in a buyer’s market, they will not 
accept substitutes for favored brands. 

A purchase is made from among the acceptable 
brands but is not mechanically determined by an 
evaluation of value, either ordinal or cardinal. An 
acceptable product may cease to be bought because
the customer who used it previously desires a
change for change’s sake. This satiation phenomenon
appears to be random and is relevant only for
individual decision; its effect probably washes out
in groups (Banks, 1950).

The number of brands considered depends in 
part upon the extent of the consumer’s experience 
and in part upon the nature of the product. The 
more experienced the consumer, the more brands 
he knows, but limits are imposed by attention and 
memory. Generally the consumer is more familiar 
with convenience goods than with shopping goods. 
In the case of shopping goods like appliances, a 
complete picture of brands is seldom available—the 
consumer shops not only to learn which brands are 
for sale, but often to discover the criteria by which 
he might evaluate the brands he has discovered. 

The term of the demand equation starting with 
R represents selling effort by retailers for each 
brand considered. The B’s represent their performance
of activities like special displays, demonstrations,
recommendations to consumers, services rendered
(large stock, credit and repair), return
privileges, etc., on the brands they carry. The next
term deals with efforts of wholesalers to push differ- 
ent brands, training courses for retailer salesmen, 
demonstrations, special price or credit concessions 
and so on. Finally, we have the term which repre- 
sents the selling efforts of the manufacturers for 
each of their brands. 

The rather simple assumptions implied by the 
form of equation used are not really satisfactory in 
representing the effect of a manufacturer's sales 
efforts. A sound marketing program calls for work- 
ing at several levels simultaneously. Manufacturers 
merchandise new consumer advertising campaigns 
to wholesalers and retailers; retailers are affected by 
advertising campaigns addressed to the general 
public; price changes affect margins throughout the 
channel of distribution. The manufacturer’s selling 
efforts and those of his wholesalers and retailers are 
often closely related. Because of this, the plus signs 
in the equation should be interpreted as general 
logical conjunctions rather than as arithmetical 
additions. Possibly multiplication signs would repre- 
sent reality more closely. 

Customers, retailers, wholesalers and manufacturers
vary greatly in scope of activity and our
equation must not be interpreted as giving equal
weight to each of the terms. Formally, the four
functions f in the equation represent quite general
functions of the factors inside the brackets.


METHOD 

Two research techniques which are often used to 
determine the effect of marketing variables upon 
sales of brands are experimentation and regression 
analysis. In the first, the researcher controls the way 
in which the independent variables affect his test 
units. One example might be a sales test of a new 
package design versus an old, each package being 
used in a comparable sample of stores. 

In the regression procedure, the researcher as- 
sumes a simple relationship between the market 
share of a brand and prices, promotional efforts, 
point of purchase advertising, etc., for it. The as- 
sumption is that the market share of a brand can be 
expressed as an equation, usually linear, which is 
of the form: 


Brand share = a (price of the brand) 
+ b (consumer’s preference rating) + etc. 


By mathematical techniques we choose values of 
the coefficients a, b, c, etc., which best fit the ob- 
served facts. The researcher cannot control factors 
such as preference ratings for all the brands, but
his mathematical procedures enable him to estimate
the effect of each while the effects of the others are 
accounted for statistically. The regression procedure 
has administrative advantages, but will not yield 
the functional relationship among the variables 
studied. 
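
The choice of coefficients "which best fit the observed facts" is ordinarily made by least squares. The sketch below fits only the two terms written out in the equation above, price and preference rating, with no constant term, and solves the normal equations directly. The function name and the four observations are assumptions made for illustration; the shares were generated from known coefficients so that the fit can be checked, and they are not data from the study.

```python
def fit_two_predictors(price, preference, share):
    """Least-squares a, b for: share = a * price + b * preference.

    Solves the 2 x 2 normal equations by Cramer's rule (no intercept)."""
    s11 = sum(p * p for p in price)
    s22 = sum(q * q for q in preference)
    s12 = sum(p * q for p, q in zip(price, preference))
    s1y = sum(p * y for p, y in zip(price, share))
    s2y = sum(q * y for q, y in zip(preference, share))
    det = s11 * s22 - s12 * s12
    a = (s1y * s22 - s2y * s12) / det
    b = (s11 * s2y - s12 * s1y) / det
    return a, b


# Hypothetical observations for four brands; NOT the study's data.
price = [0.20, 0.25, 0.30, 0.35]        # price of the brand
preference = [0.50, 0.55, 0.70, 0.60]   # consumers' preference rating

# Shares built from known coefficients so the fit has a checkable answer.
true_a, true_b = -0.5, 0.4
share = [true_a * p + true_b * q for p, q in zip(price, preference)]

a, b = fit_two_predictors(price, preference, share)
print(f"a (price) = {a:.3f}, b (preference) = {b:.3f}")
```

Each coefficient estimates the effect of its own variable with the other held constant statistically, which is the sense in which the effects of the others are accounted for.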

The user of regression analysis assumes that:


1. The values of the independent variables are
   fixed and may be looked upon as population
   parameters. Often particular values are deliberately
   chosen.

2. For a given set of values of the independent
   variables, the resulting values of the dependent
   variable are normally distributed.

3. The sample is drawn by a process of
   random selection (Anderson and Bancroft,
   1951).


The research situation which we shall discuss has
an additional complication in that all variables,
both independent and dependent, are subject to 
error. Bartlett (1949), and more recently Acton 
(1959), have discussed procedures for dealing with 
this type of situation, but few applications have ap- 
peared in the literature. 

In evaluating the results of tests of significance
of regression coefficients, caution will be used. In a
numerical example presented by Bartlett (1949), the
95 per cent confidence interval of the regression
coefficient is 16 per cent larger, assuming both
variables are subject to error, than when assuming 
the independent variable is stated or measured 
without error. 

Our data were collected in early December 1950 
from 165 Chicago housewives selected by area 
sampling procedures. Blocks were chosen at random 
and four respondents picked at random in each 
block. Purchases were measured about a week be- 
fore information was collected on other variables 
but it was felt that the situation prevailing at the 
time of purchase could not differ materially from 
that a week later. 

For both scouring cleanser and coffee, the in- 
terview covered the respondent's knowledge and 
use of the various brands, preference ratings on the 
brands she knew, and brands, quantities, and place 
of her last purchase. Information was also obtained 
on possibilities for exposure to advertising in terms 
of ownership of a radio or TV set, subscription or 
regular readership of magazines and Chicago news- 
papers. Respondents were classified into high, 
medium and low economic strata on the basis of the 
1940 rent data for the block in which they lived;
this last rating was subject to revision by the
interviewer after inspection of the household
furnishings and equipment.


The respondents were asked to state their
preferences for brands of scouring cleanser and coffee by
means of a thermometer rating device. One-half of
the respondents made preference statements before
the question of purchases was raised and the other
half made similar preference statements after the
interviewer had determined the brands on hand.
The two preference distributions differed
insignificantly, from which we inferred no bias induced
by the order of questioning, and the returns of both
halves were combined.

Only the highest preference ratings made by re- 
spondents were considered. If a respondent placed 
several brands in the highest category she used, each 
brand received an equal fractional share of the 
rating. The sum of these ratings for all respondents 
gave the total number of highest-choice ratings per 
brand. This procedure is described elsewhere 
(Banks, 1950). 
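
The scoring rule just described can be sketched as follows. This is an illustration under stated assumptions: each respondent's ratings are held in a dictionary keyed by brand, and the function name `highest_choice_totals`, the brand names and the scores are all hypothetical.

```python
from collections import defaultdict


def highest_choice_totals(ratings_by_respondent):
    """Total highest-choice ratings per brand.

    Each respondent contributes exactly one rating, split equally among
    the brands she placed in the highest category she used.
    """
    totals = defaultdict(float)
    for ratings in ratings_by_respondent:
        if not ratings:
            continue  # respondent rated no brands
        top = max(ratings.values())
        top_brands = [brand for brand, score in ratings.items() if score == top]
        for brand in top_brands:
            totals[brand] += 1.0 / len(top_brands)
    return dict(totals)


# Hypothetical thermometer scores for three respondents.
respondents = [
    {"Brand A": 90, "Brand B": 70},
    {"Brand A": 80, "Brand B": 80, "Brand C": 40},  # tie: half a rating each
    {"Brand C": 60},
]
totals = highest_choice_totals(respondents)
print(totals)
```

Because every respondent contributes one rating in total, the brand totals sum to the number of respondents who rated at least one brand.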

To obtain data on purchases, the interviewer 
asked to see all containers of the last purchase 
under the guise of obtaining code numbers. Only 
when the containers were reported destroyed (e.g.,
when coffee packed in bags had been put into
canisters and the bag discarded) were housewives asked
to tell what brand they had bought last. Brand
shares were of total amount bought in last
purchases.

After discussing brand preference and purchase 
with a respondent, it was easy to discover where her 
last purchase of scouring cleanser and coffee had 
been made. The interviewer then went to the 
designated store or stores (not infrequently the 
scouring cleanser and coffee were bought in differ- 
ent stores) as soon as possible after finishing a given 
block assignment of interviews, and for each of the 
brands carried observed the price (in cents per 
package for scouring cleanser, in cents per pound 
for coffee), the amount of stock displayed, and the 
presence of promotional effort and point of pur- 
chase displays. 

The formal model of demand discussed at the 
beginning of this article must be simplified drasti- 
cally for empirical research because it deals with a 
very large number of variables, most of which are 
extremely difficult to measure. The model became, 
after appropriate simplification, one of multiple 
linear regression. The following equation was set 
up to study the forces affecting market shares of
brands of scouring cleanser and coffee. 


P_i = aX_1 + bX_2 + cX_3 + dX_4 + eX_5 + fX_6 + gX_7

where

P_i = Each brand’s share of the sample’s last purchase
      of the product.

X_1 = Consumer preference in terms of number of
      highest ratings per brand.

X_2 = Average price in cents per unit.

X_3 = Store coverage =
      (No. stores stocking each brand, weighted
      by number shopping these stores)
      / (Total number of users)

X_4 = Index of stock display =
      [1 (no. good shelf displays) +
      2 (no. special displays)] × 100
      / (Total number of ratings)

X_5 = Index of promotional effort =
      (No. stores where brand carried offers
      of price deals or premium)
      / (No. of stores stocking)

X_6 = Index of point of purchase advertising =
      (No. stores where brand had
      POP effort × 100)
      / (No. of stores stocking)

X_7 = Dollar expenditure for advertising in the
      three major media (newspaper, radio and
      magazines), Chicago, June through
      November 1950.

a, b, c, d, e, f, g are the regression coefficients
which were computed mathematically.


Information on consumer advertising expendi- 
tures was obtained from three sources. A. C. Nielsen 
Company made available (in private correspond- 
ence) radio, newspaper and magazine advertising 
expenditures for brands of scouring cleanser and 
coffee in metropolitan Chicago from June through 
November 1950. This was satisfactory for scouring 
cleanser but gave no information on advertising of 
chains’ private brands of coffee. The Chicago 
Tribune made available unpublished data on total 
advertising expenditures of these chains in Chicago 
in newspapers during this period. Some chains were 
willing to state, also in private correspondence, 
what share of their local advertising budget was 
allocated to their private brands of coffee; for the 
others, a sample of newspapers was selected and the 
ratio of space found to be allocated to their brands 
of coffee was used as the share of their total advertis- 
ing budget allocable to their private brands of 
coffee. 

RESULTS

The data were analyzed to determine first, how 
successfully—as measured by the coefficient of multi- 
ple correlation—the research model fitted the actual 
purchase pattern; and second, the relative im- 
portance of the different elements of the model, as 
measured by the size of their regression coefficients. 

First we considered how closely each factor sepa- 
rately was related to brand shares. For this we 
examined the simple correlations. For both coffee 
and scouring cleanser, consumer preference rating 
and store coverage, themselves highly correlated, 
showed the highest simple correlation with market 
shares. For scouring cleanser, promotional effort was 
highly correlated with market shares, while adver- 
tising expenditure was poorly correlated. The 
reverse was true for coffee. For both products, ad- 
vertising expenditure was more highly correlated 
with store coverage than with market share or any 
other variable. 

Table 1 shows the regression coefficients which 
permit direct evaluation of the relative effect of the 
independent variables on the dependent variable, 
brand shares. 


TABLE 1
REGRESSION COEFFICIENTS BETWEEN BRAND
SHARES AND SEVEN MARKETING ACTIVITIES

                               Cleanser (N = 9)   Coffee (N = 21)
                               (Multiple          (Multiple
Marketing Activity             R = .999)          R = .972)

Brand Preference (a)               .368*             1.108†
Average Price (b)                 -.436*             -.202
Store Coverage (c)                 .150*              .609
Stock Display (d)                  .224*             -.364
Promotional Effort (e)             .416*              .067
POP Advertising (f)               -.242*             -.207
Advertising Expenditure (g)        .143*             -.536*

* Significant at the five per cent level.
† Significant at the one per cent level.

For Cleanser:
P_i = .368X_1 - .436X_2 + .150X_3 + .224X_4 + .416X_5
      - .242X_6 + .143X_7

For Coffee:
P_i = 1.108X_1 - .202X_2 + .609X_3 - .364X_4 + .067X_5
      - .207X_6 - .536X_7


For the scouring cleanser equation, all coefficients 
were significant at the five per cent level of confi- 
dence. The most important factors in determining 
market shares of brands were price, promotional 
effort and brand preference. As might be expected, 
price is negatively related, while promotional effort 
and brand preference are positively related to 
market share. One apparent anomaly was that point 
of purchase advertising was negatively related. 

For coffee, the regression model produced a
coefficient of multiple correlation of .972, significant at
the one per cent level of confidence. However, in 
contrast to the scouring cleanser data, only two of 
the marketing factors studied, brand preference and 
advertising expenditure, were found to have signifi- 
cant effects upon the share position of brands of 
coffee, while store coverage approached significance. 

For scouring cleanser it was observed that there 
were relatively high correlations between the brand 
preference ratings and several of the variables 
measuring marketing activity. The question arose— 
need we consider preference at all in such a demand 
equation? 

The question was answered by dropping prefer- 
ence as an independent variable and noting what 
happened to the fit of the regression equation. This
made little difference: R² dropped from .9997 to
.9903, a change of less than one per cent. The reason 
for this may be found in the results of the regression 
of these marketing variables on the preference. 
Ninety-three per cent of the variance in preference 
for brands of scouring cleanser was accounted for 
by variance in the six external marketing variables. 
All of the regression coefficients were significant be- 
yond the five per cent level, with those of price, 
promotional effort and stock display being highest. 
Differences among these three were not significant. 

For coffee, on the other hand, dropping the 
preference variable reduced R² from .9456 to .6063,
a drop of about 34 percentage points (some 36 per cent
in relative terms). Although the
six-variable regression equation without preference 
for various brands of coffee still yielded a statisti- 
cally significant multiple correlation coefficient, it 
is clear that these customers were more sensitive to 
the qualities of coffee brands than to the qualities 
of cleanser brands. 
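
The size of these changes in fit can be checked by simple arithmetic. The sketch below uses only the R² values quoted in the text; the function name `r2_change` is hypothetical, and both the absolute and the relative change are shown because either reading is possible.

```python
import math


def r2_change(full, reduced):
    """Return (absolute drop, drop relative to the full-model R²)."""
    absolute = full - reduced
    relative = absolute / full
    return absolute, relative


# R² with and without the preference variable, as quoted in the text.
cleanser_abs, cleanser_rel = r2_change(0.9997, 0.9903)
coffee_abs, coffee_rel = r2_change(0.9456, 0.6063)

# The multiple correlation R is the square root of R².
coffee_R = math.sqrt(0.9456)

print(f"cleanser: {cleanser_abs:.4f} absolute, {cleanser_rel:.1%} relative")
print(f"coffee:   {coffee_abs:.4f} absolute, {coffee_rel:.1%} relative")
print(f"coffee multiple R = {coffee_R:.3f}")
```

On these figures the cleanser fit loses under one per cent of its explained variance when preference is dropped, while the coffee fit loses roughly a third.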

The linear equation based on six marketing vari- 
ables did a satisfactory job of “explaining” shares 
of brands of scouring cleanser and coffee. However, 
there are advantages in reducing the number of 
variables. Other sets of regression equations were 
developed, using only the three marketing variables 
found to have the strongest relation to brand shares. 

For scouring cleanser, the three used were price, 
store coverage and promotional effort. These three
variables were quite effective in fitting the data; R²
dropped from .9903 to .8892, only 10 per cent.

Because of the ease with which this three-variable 
equation could be computed, it was applied to 
various segments of the total sample. On the basis 
of information collected during the interview, 
respondents could be classified in the following 
ways: by income group; by whether they were
exposed to much advertising; and by whether they
shopped mostly at chain or independent stores. 

Income was determined from 1940 Census rent 
data and modified by interviewer’s evaluation of 
homes. To be considered as being “exposed to much 
advertising” they had to be exposed to three adver- 
tising vehicles other than radio programs. Type of 
store usually shopped was determined by question- 
ing. 

In a cross-classification of respondents by income
level and stores shopped (see Table 2), it was found
that the low income groups patronized independents
to a much greater degree than did the two upper
income groups. This was largely because few chains
have units in the Negro and low income areas.
There was no clear relationship between income
and availability to advertising exposure.


TABLE 2

REGRESSION COEFFICIENTS BETWEEN CLEANSER
BRAND SHARES AND THREE MARKETING
ACTIVITIES: BY STRATA

                        Regression Coefficients

                 Weighted                              Multiple
                 Average     Store      Promotional   Correlation
Group            Price       Coverage   Effort        Coefficient

Entire Sample    -.291†       .491†       .603†         .943†
Chain Store       .083       -.049       1.016†         .977†
Independents     -.045        .601*       .343          .832*
Adv. Prone       -.338†       .587†       .530†         .938†
Non-Prone        -.037        .164        .831†         .936†
High Income       .044        .210        .837†         .919†
Medium Income     .974        .130        .867†         .935†
Low Income        .029        .162        .809†         .909*

* Significant at the five per cent level.
† Significant at the one per cent level.

Even when the total sample was split up, the 
regression coefficients for the scouring cleanser data 
remained significant except for respondents shop- 
ping at independent grocery stores. Promotional 
effort and distribution apparently were more
important than price in “explaining” variations in
scouring cleanser brand shares for the entire sample 
as well as for its different segments. However, all 
three coefficients were still significant beyond the 
one per cent level. 

Between the various segments of the sample, some 
differences in importance of the three variables did 
emerge. Apparently chain store shoppers were more 
susceptible to promotional offers and deals for 
brands of scouring cleanser than housewives who 
shopped at independent stores; for the latter, avail- 
ability was the most important factor. Those open 
to advertising exposure were equally affected by the
store distribution of brands and the use of
promotional effort. Price also had a strong effect.
Promotional effort was the only variable among the
three tested to affect market share among those
respondents not available to heavy advertising
exposure.

Income seemed to have no effect upon the weights 
of the variables in the demand equation. This seems 
plausible since scouring cleanser is relatively cheap. 
It is interesting to note that disguised price reduc- 
tions—in terms of special deals or offers—had a 
much stronger effect upon scouring cleanser brand 
shares than did actual price differences. This was 
equally true for all income groups. 

A three-variable regression model was also fitted 
to the coffee data using the variables found to have 
highest correlation with market shares: price, store 
coverage and past six months’ advertising expendi- 
ture. 

It was found to yield a statistically significant fit; 
the coefficient of multiple correlation was significant 
at the one per cent level. However, the fit of the 
three-variable equation for coffee was substantially 
poorer than that for scouring cleanser. There are 
at least two reasons for this: the greater diversity of 
marketing patterns among the 21 brands of coffee 
than among the nine brands of scouring cleanser; 
and the greater importance of brand quality for 
coffee than for scouring cleanser. 

For the entire sample, only store coverage had a 
statistically significant relation with market shares 
of coffee brands. Neither price nor advertising ex- 
penditure was found to have a significant effect 
upon market shares when the other two factors were 
held constant. 

The three-variable model was applied to various 
segments of the sample and statistically significant 
fits were obtained among chain store shoppers, those 
advertising prone and those in the high income 
group. Probably these three sub-groupings over- 
lapped so that the same respondents showed up 
under different headings. 

Although the data were not statistically signifi- 
cant, an interesting situation held among the 
“manufacturer's brand” buyers. Among those 
people who bought a manufacturer’s brand (Hills
Brothers, Chase & Sanborn, Stewarts, etc.), advertis- 
ing actually appeared to have a negative relation 
with sales. Examination of the raw data indicated 
that, during the period studied, Maxwell House was 
spending 40 per cent of the total advertising volume 
in Chicago for these six brands, but was receiving 
only 16 per cent of their total sales. In contrast,
Manor House was spending only nine per cent of 
the total advertising volume, but receiving almost 
30 per cent of total sales. 
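
The contrast between the two brands can be restated as the ratio of sales share to advertising share. The sketch below uses the percentages quoted in the text; the Manor House sales figure is entered as 30 although the text says "almost 30", so its ratio is approximate.

```python
# (share of advertising volume, share of sales), both in per cent.
brands = {
    "Maxwell House": (40.0, 16.0),
    "Manor House": (9.0, 30.0),   # sales share quoted as "almost 30"
}

# A ratio above 1 means the brand's sales share exceeds its advertising share.
ratios = {name: sales / adv for name, (adv, sales) in brands.items()}

for name, ratio in ratios.items():
    print(f"{name}: sales share / advertising share = {ratio:.2f}")
```

A spread of this size between two heavily bought brands is enough to pull the fitted advertising coefficient below zero in a small group of brands.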


CONCLUSIONS 

The more general model discussed at the beginning
of this paper has illustrative value for teaching
purposes. It formulates problems of demand in
marketing terms by dealing with market share data 
obtained by differentiated brands whose owners 
compete with all the tools in their respective 
arsenals. Students are thus presented with a device 
for considering the major variables affecting sales. 

The demand model easily accommodates the 
familiar discussion of convenience, shopping and 
specialty goods. For example, the demand model 
for convenience goods would likely show store 
coverage, point of purchase display and promo- 
tional effort to be most important in affecting sales. 
On the other hand, for specialty goods preference 
would probably be the only variable of major 
importance. 

The model tested by the data presents more of a 
mixture of values. Such a regression model can 
approximate the importance of various factors 
affecting market shares, and the relationships be- 
tween these factors. Surveys are less helpful on this 
point since people seldom can evaluate the relative 
importance of the factors impinging upon their 
purchase decisions. 

The results of regression analysis should be con- 
sidered as first approximations for several reasons. 
Foremost is the fact that they show only co-varia- 
tion, not cause and effect. The regression analysis 
is useful to point out the factors to be used in ex- 
perimentation, but should not be considered as a 
substitute for it. 

Findings from the regression model hold only for 
the range of observations available in the data. 
Promotional effort was found to have a stronger 
relationship than price with brand shares of scour- 
ing cleanser. But the range of prices was quite 
narrow, 8.6 to 12.9 cents per can. Whoever breaks 
through these limits may well find price to have a 
great effect upon market shares. 

Another caution in the use of the regression 
analysis is that it yields only over-all relationships. 
In any market, some brands are declining in market 
shares, others are rising, while still others are 
merely holding their own. The regression procedure 
gives coefficients which are actually averages of the 
coefficients for the individual brands. These results
may not apply to any one brand. 

Marketing strategies usually call for manipulating 
several variables simultaneously. Manufacturers 
merchandise their coming advertising campaigns to 
their retailers, who respond by improving stock 
holdings and displays and by putting up point of 
purchase advertising sent them. Private brands are 
usually offered in only a few stores; but in these 
stores they are usually given the best locations, 
largest stock displays and massive point of purchase 
advertising displays. Manufacturers’ brands, espe- 
cially in convenience goods, tend toward 100 per 
cent coverage but with less prominence of display 
within stores. Regression analysis is not the best 
way to cope with these different relationships be- 
tween several independent variables. 

Finally, regression analysis is a quantitative pro- 
cedure and uses essentially quantitative evaluations 
of data. It is quite likely that many relationships are 
distorted by the units we use to express these 
quantities. Advertising is the most important case in 
point though the problem also arises with premiums 
and deals. If the effect of advertising were pro- 
portional to expenditure on it, then the firm which 
spent the most for advertising would sell the most 
product. However, this does not happen (Borden, 
1942). Advertising effect depends not only upon 
magnitude of expenditure but also upon the
motivating power of the copy and upon the media used.
Failure of advertising expenditure to correlate with
sales does not mean that advertising is ineffective,
but may mean only that the measuring procedure 
failed to evaluate properly the strengths of various 
campaigns. 

Implicit in the model presented is the assumption 
that all variables act instantaneously. This
assumption is open to serious doubt. An effort was made
to take differing time lags into consideration by
considering advertising expenditures for the
previous six months, while all other variables were
assumed to be acting at the time of the research. 
This was a guess. It was found that the correlation 
of brand sales with the previous year’s advertising 
was slightly higher than with the data of the shorter 
period, but the improvement was not significant. 
The varying time lag of different variables is cer- 
tainly one of the most important matters of concern 
to marketing directors yet little or no research has 
been devoted to it. 

I have said much of the limitations of the
regression model but little of its value. I believe it
offers real advantages. For relatively little expendi- 
ture, a substantial amount of material can be 
collected and evaluated. The simple correlations 
between market shares and the independent mar- 
keting variables will give a picture of the marketing 
strategies being used for the brands of a given 
product class, plus relationships between consumers’ 
appreciation of the qualities of brands and external 
marketing variables. Finally, the findings of the 
multiple regression analysis can be considered a first 
approximation of the relative importance of these 
marketing variables—especially if more faith is put 
in findings of no effect than in findings of much 
effect. 

As an instance of the nonsense or spurious correlation that is a real statistical fact
someone has gleefully pointed to this: There is a close relationship between the
salaries of Presbyterian ministers in Massachusetts and the price of rum in
Havana. Which is the cause and which the effect? In other words, are the ministers
benefiting from the rum trade or supporting it? All right. That’s so far-fetched
that it is ridiculous at a glance. But watch out for the other applications of post
hoc logic that differ from this one only in being more subtle.


—DARRELL HUFF 

How Reliable Is
Aided Recall of TV Viewing? 


A. S. C. EHRENBERG 


Research Services Limited 


This British analysis finds claims of viewing television virtu- 
ally identical for recall periods of from one to seven days. 


Three recent large surveys of adult television
viewing in the United Kingdom have been
based on a personal interview aided recall technique 
covering the seven days before the interview. The 
seven-day aided recall technique had been devel- 
oped and tested before use by D. R. Aitchison and 
B. Brunning (1959), as described in their report, 
Experimental Work and Method for the Granada 
Viewership Surveys. This work showed that the 
technique gave viewership claims for two, three 
and up to seven days ago which were strictly com- 
parable to those obtained for “yesterday,” and thus 
provided ample evidence at that stage for using 
the seven-day technique in practical work. Recall 
aids were the relevant copies of Radio Times, TV 
Times or equivalent regional publications which 
set out in some detail the television programs 
shown by the British Broadcasting Corporation 
(BBC) and by the independent commercial pro- 
gram contractors (ITV). 

This paper describes additional checks from the 
three full-scale surveys themselves on the com- 
parability of the seven-day and one-day aided 
recall techniques. The need for such further checks 
arose partly because the initial testing, although 
highly controlled, was based on relatively small 
samples, but mainly because this initial testing 
was carried out as an experiment and there exist 
many instances where the results of experimental 
pilot work did not prove reproducible under full- 
scale operating conditions. In particular, it is quite 
possible that the quality of the interviewing staff 
used can seriously affect the reliability of a tech- 
nique such as the present one. 

While considering the seven-day aided recall 
technique in terms of its reliability, it is as well to 
bear in mind how it compares with the more
standard one-day recall procedure in other respects:


1. Costs: Seven days’ viewing information is obtained in 
the course of one interview, while the latter still remains at 
quite a manageable length. The increase in the amount of 
information obtained is thus sevenfold. 

2. Continuous Information Obtained: The continuous 
viewing data which is obtained over all seven days of the 
week for each informant permits analyses of the pattern of 
viewing throughout a week, typified by the duplication and 
intensity-of-viewing tables presented in the published survey 
reports. 

3. Sampling: The completely even coverage of every day 
of the week, together with the sheer numerical balancing 
effects of the longer recall period, reduce the need for in- 
terviewing a completely representative sample on each sep- 
arate day of the survey period. (Absolutely rigorous daily 
spacing of representative samples is difficult and expensive 
to achieve in practice, yet failure to do so can easily intro- 
duce sizable biases in information referring to “yesterday” 
only.) By the same token, random sampling as opposed to 
quota sampling becomes much more feasible, bearing in 
mind that with one-day recall, random sampling can create 
more problems and biases than it avoids. 


METHOD 

The present analysis of the seven-day aided recall 
technique has been based on the data of ITV 
viewing (which accounts generally for some 70 per 
cent of all television viewing in the United King- 
dom), from the three Granada Viewership Surveys 
(Granada TV Network Ltd., 1959-60). 


A. S. C. EHRENBERG is a direc- 
tor of Research Services Limited 
in London. He graduated in 
mathematics from Durham Uni- 
versity in 1947 and for the next 
six years held lecturing and re- 
search appointments at the Uni- 
versities of Durham, Cambridge 
and London. In 1955 he entered 
market research with the Attwood 
Group and joined Research Serv- 
ices in 1959. A council member of 
the Market Research Society, Mr. 
Ehrenberg has published some 
20 papers in journals such as 
Applied Statistics, Biometrika, Commentary, Occupational 
Psychology, British Journal of Psychology and Journal of 
the Science of Food and Agriculture.

TABLE 1
SAMPLE SIZES (ADULTS)

Granada Viewership Survey     Unweighted     Weighted*

Jan.-March 1959                  6,826         10,729
June-Sept. 1959                  6,777         11,376
Jan.-March 1960                  7,700         10,533

* To allow for differential sampling fractions by regions.


While interviewing was spread almost evenly 
over the five weekdays, the number of interviews 
carried out on Saturdays and Sundays was relatively 
low. Since every interview necessarily covers view- 
ing on all the seven days prior, the over-all survey re- 
sults remain unaffected by this as long as two con- 
ditions are fulfilled. On the one hand, the rate of 
interviewing throughout the survey period must be 
suitably controlled (details of the sampling pro- 
cedures are given in the survey reports); on the 
other hand, the seven-day recall technique itself 
must of course be reliable.

The lower number of weekend interviews does, 
however, make itself felt in breaking out the results 
for each separate length of recall period, as is 
necessary for the present analysis. For example, 
the proportion of viewing claims for seven days ago 
which referred to weekend viewing was relatively 
low, because of the small number of interviews 
which were made at the next weekend, seven days 
later. Since viewing tends to be heavier at weekends 
than on weekdays, analyses of the raw data by 
length of recall period would tend to contain small 
biases. To eliminate these, the analyses here have 
had to be based on daily averages (i.e., treating 
daily samples as effectively equal) and not on daily 
totals (i.e., giving greater weight to the larger sam- 
ples). The implicit upweighting of the smaller 
samples, especially of the Sunday interviews, does 
however accentuate sampling fluctuations which 
occurred in these subsamples. 
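
The difference between the two bases can be shown with a small sketch. The figures are hypothetical (a large weekday sample and a single Sunday interview with heavier viewing), and the function names are illustrative only.

```python
def mean(values):
    return sum(values) / len(values)


def daily_average(claims_by_day):
    """Average of the per-day means: each interview day counts equally."""
    return mean([mean(claims) for claims in claims_by_day.values()])


def daily_total_average(claims_by_day):
    """Pooled mean: days with more interviews carry more weight."""
    pooled = [c for claims in claims_by_day.values() for c in claims]
    return mean(pooled)


# Hypothetical half-hour segments claimed per informant, by interview day.
claims_by_day = {
    "Monday": [2, 2, 2, 2],
    "Sunday": [4],          # small sample, implicitly upweighted
}

by_daily_average = daily_average(claims_by_day)
by_daily_total = daily_total_average(claims_by_day)
print(by_daily_average, by_daily_total)
```

In this example the single Sunday interview carries half the weight in the daily average, which removes the bias from uneven interviewing but lets one unusual respondent move the result.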

The Granada Viewership Survey data contain 
one other minor complication when subjected to 
the special analyses reported here. Viewing claims 
for various recall periods did not always refer in 
precisely equal numbers to the same calendar dates 
—a factor which was strictly controlled in the initial 
experimental work. (Typically we have what may 
be called tail-end effects, for example, that the first 
day’s viewing measured in each survey was neces- 
sarily by seven-day recall only.) This shortcoming 
may, however, be disregarded because numerically 
it tends to be swamped by the sheer size and dura- 
tion of the survey, while its effect could only be to 
produce unwanted fluctuations or trends in the 
present analyses rather than to induce spuriously
the kind of consistency which has actually been 
found. 


RESULTS 


Extensive analyses of the survey data which have
been made to study the reliability of the seven-day
recall technique showed a highly consistent pattern.
The findings can therefore be presented quite
briefly.

Table 2 sets out the average number of half-hour
segments of evening viewing claimed per informant,
analysed by length of recall period or
memory span. The table shows that the amount
of viewing fluctuates but little with the length of
recall period and shows little sign of systematic
trends. The only noticeable deviations, taking all
three surveys together, appear to be that viewing
claims are if anything slightly below average for
one- and two-day recall and slightly above average
for seven-day recall.


TABLE 2

DAILY AVERAGE NUMBER OF HALF-HOUR
SEGMENTS VIEWED BETWEEN 5 AND 11:30 P.M.

Granada Viewership Survey

                 Jan.-March   June-Sept.   Jan.-March
Recall Period       1959         1959         1960
2.8 


1 day h 2.0 
2 days : 1.9 2.7 
3 days ; 22 28 
4 days : 2.1 2.7 
; 2.1 2.8 
2.0 2.8 
2.1 


23 
Average. - 2.1 2.8 


5 days. 
6 days 
7 days 


NOTE: The increase from the first to the third survey shown here
and in Table 4 is essentially due to the increasing proportion of
adults who had television sets capable of receiving ITV.


In order to check further on these small deviations,
and also to see if any other small or possibly
counter-balancing trends are hidden by the daily
averages just described, viewing at specific times
of the evening was examined in detail. The full
data (for 13 half-hour segments from five to
11.30 p.m. by seven recall periods for three surveys,
subanalysed by day of interview) are rather
cumbersome, but bearing in mind the high correlation
in viewing between successive half-hour segments,
the basic pattern can be well demonstrated by
considering early evening viewing and one or
one-and-a-half-hour intervals thereafter. Table 3
therefore shows, for the three surveys combined, the
average per cent of informants viewing in any
half-hour segment at different times of the evening,
analysed by length of recall period. Thus the first
figure (9) is the per cent of adults who viewed,
on average, in any one of the four half-hour
segments from 5-7 p.m.


TABLE 3 


DAILY AVERAGE PER CENT OF INFORMANTS 
VIEWING IN ANY HALF-HOUR SEGMENT 


Time of Viewing (p.m.) 


Recall Period 5-7 - 9-10 10-1130 


1 day 27 
2 days 27 
3 days 
4 days 
5 days 
6 days 
7 days 





Average 


NOTE: All three surveys combined. 


There is again a virtual absence of any trend 
with increasing length of recall period at each time 
of viewing, apart from the slight deviations al- 
ready noticed in Table 2. These deviations, which 
are to some extent apparent in all three surveys, 
can now be seen to arise mainly during the early 
evening. Moreover, further scrutiny of the detailed 
data has for example shown that the lower figures 
for one- and two-day recall are mainly due to a 
small, heavily upweighted sample of some 20 Sun- 
day interviews from the second survey (June-Sep- 
tember, 1959). In fact, the analysis of the detailed 
workings makes it clear that the small fluctuations 
which do occur in the data are mainly due to the 
sampling errors of the small, and hence heavily 
upweighted, weekend samples especially from the 
second survey. 

Turning in Table 4 to afternoon viewing, the
average number of half-hour segments viewed on
a weekend afternoon shows, if anything, an even


TABLE 4 


DAILY AVERAGE NUMBER OF HALF-HOUR 
SEGMENTS VIEWED BEFORE 5 P.M. ON 
SATURDAYS AND SUNDAYS 


Granada Viewership Survey 


Jan.-March June-Sept. Jan.-March
Recall Period 1959 1959 1960 
1 day 48 J 67 
2 days 52 
3 days 61 
4 days 59 
5 days 54 
6 days 51 
7 days 52 





Average 54 
*Small sample fluctuation (see text). 
higher degree of consistency over the various recall
periods. 

The only aberrant figure is that for seven-day 
recall in the third survey (January-March, 1960). It 
is caused by the heavy upweighting of some rela- 
tively high viewing claims found in the small 
sub-sample of 70 (regionally weighted) Sunday in- 
terviews and must be dismissed as a sampling 
fluctuation. 


CONCLUSIONS 


This analysis of the data from the three Granada 
Viewership Surveys supports the initial experimen- 
tal work in showing that TV viewing claims ob- 
tained by carefully controlled aided recall inter- 
viewing tend to be identical for recall periods 
ranging from one to seven days. Fluctuations in 
the data with increasing length of recall period are 
small and virtually unsystematic, and can mainly 
be traced to sampling errors from small sub-samples 
which had to be upweighted for the purpose of 
this analysis. 

These stable findings are undoubtedly unusual. 
General experience in market research and allied 
fields leads one to expect more untoward memory 
effects, namely a tendency to remember less with 
increasing memory span and a somewhat contrary 
tendency to condense spuriously the duration of 
earlier time periods. The main reason for the pres- 
ent findings must, one feels, lie in the use of aided 
rather than unaided recall, coupled with the rela- 
tive ease with which highly comprehensive and 
graphic, and indeed “natural”, recall aids could 
be employed. Other sources of explanation may be 
the relatively short memory span involved—one 
week—and the corresponding cyclical period of 
much television programming. 

In conclusion, it must perhaps be stressed that 
the present results cannot strictly pretend to show 
anything more than that the seven-day aided recall 
technique may be made to yield data that are reli- 
able when compared with the more commonly 
used one-day aided recall approach. However, it 
can also be argued that the absence of trends with 
increasing memory span lends its own support to 
the validity of aided recall as such for assessing the 
size and nature of television audiences. 
Letters


Comment on Meissner’s “Sales and Advertising of Lettuce” 


JOHN C. MALONEY
Leo Burnett Company 


I WOULD CHALLENGE Meissner’s conclusions about
the effect of advertising on lettuce sales (this
Journal, Volume 1, Number 3, March, 1961). He
implies that the partial correlation coefficients for
the various independent variables in his matrix, 
including the advertising variables, are directly in- 
dicative of the causal influence that each such vari- 
able has upon the dependent variable of lettuce 
sales. Thus he says: 
“It appears that in 1950-55 only four variables—price, temper- 
ature, fieldmen, and newspaper advertising—had a significant 
influence on lettuce consumption. Of the remaining six 
variables, it cannot be said with confidence that they had a 
net effect on lettuce consumption. . . .” 

In a very similar study a number of years ago, I 
studied the interrelationships of nineteen different 
independent variables as potential “predictors” or 
“explainers” of wide variations in sales levels be- 
tween 52 different bakery sales branches. The inde- 
pendent variable which produced the highest par- 
tial correlation with branch sales was “ratio of 
average salesman’s earnings to average factory work- 
er’s earnings in the branch area.” It would have 
been fallacious to assume that this ratio was “hav- 
ing an effect upon” the sales successes and failures 
in the branches; it was simply symptomatic of vari- 
ations in labor market. My best predictor in this 
case did not turn out to be a cause of high or low 
sales but a correlate of an indirect cause of low 
sales in the poorer branches. (Incidentally, in the 
study to which I refer the multiple R shrank from 
about .90 to about .70 when the regression weights 
were applied to “holdout” data—data not used to 
compute the regression weights and partial corre- 
lations. Such “shrinkage” will almost inevitably 
occur when multiple regression formulas are thus 
“cross-validated.” Since Meissner did not cross-validate
his findings, his multiple correlation coefficient
of .66 is probably a fairly gross overestimate of the
real “predictive efficiency” of his formula.)
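The shrinkage described above can be illustrated with simulated figures. The following is only a sketch under stated assumptions: the data are random numbers, not the bakery or lettuce data; the choice of 52 cases and 19 predictors merely echoes the sizes mentioned in the letter; the holdout sample is made larger (1,000 cases) so that its correlation is stable; and all function names are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit(rows, y):
    """Least-squares regression weights (intercept first) from the normal equations."""
    xs = [[1.0] + r for r in rows]
    k = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(xs, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(w, rows):
    """Apply fitted weights to each row of predictor values."""
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], r)) for r in rows]


def corr(u, v):
    """Product-moment correlation between two equally long lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def simulate(n, p, rng):
    """Simulated data (an assumption): only the first predictor truly influences y."""
    rows = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [r[0] + rng.gauss(0, 1) for r in rows]
    return rows, y


rng = random.Random(1961)
train_x, train_y = simulate(52, 19, rng)    # data used to compute the weights
hold_x, hold_y = simulate(1000, 19, rng)    # "holdout" data, never used in fitting

weights = fit(train_x, train_y)
r_fit = corr(predict(weights, train_x), train_y)
r_holdout = corr(predict(weights, hold_x), hold_y)
print(f"multiple R on the fitting data: {r_fit:.2f}")
print(f"multiple R on the holdout data: {r_holdout:.2f}")
```

The multiple R on the fitting data flatters the formula because the weights were chosen to suit those particular cases; the holdout figure is the fairer estimate of predictive efficiency.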

Partial correlation coefficients should never be
interpreted simply as direct indicators of causal
effect. They do not even tell us how strongly an
independent variable relates to a dependent variable.
They merely indicate how well a given independent
variable adds to the efficiency of explaining,
predicting, or “accounting for” a dependent
variable’s variance once the vagaries of certain
concomitant relationships are taken into account.

If the six advertising variables in Meissner’s 
matrix are highly correlated with each other, we 
would not expect more than one of these variables 
to produce a substantial partial correlation. Under 
such circumstances one of the advertising variables 
would “speak for all of the advertising variables”
in making the best possible “explanation” of let- 
tuce sales. 

As a matter of fact, a variable which has a posi- 
tive causal effect on a dependent variable can often 
end up with a negative partial correlation. When 
this occurs the independent variable is working as 
a “suppressor variable”—and its partial correlation 
depends not upon its relationship to the dependent 
variable so much as it depends upon the suppressor 
variable’s ability to “dampen the nonvalid variance”
of certain other independent variables. Meissner
found this phenomenon in his own data with regard
to the “Fieldmen” variable (and perhaps the
“Radio spot” and “Radio network” variables). This
is clearly implied by the following: 

“. . . Thus the regression coefficients for wholesalers, retailers,
and newspapers are positive. This can largely be traced back
to fieldmen, because when these three variables are left out
from the equation, the effect of ‘Fieldmen’ turns out to be
positive.” (Italics added.)
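The suppressor effect can be seen in a small simulation. This is a sketch of the classical textbook case only, not a reconstruction of the lettuce data: the variable names, the sample size and the noise levels are all assumptions. Here the second predictor measures nothing but the irrelevant variance contaminating the first, so it is nearly uncorrelated with sales and yet earns a large negative partial correlation.

```python
import math
import random


def corr(u, v):
    """Product-moment correlation between two equally long lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def partial_corr(r_y2, r_y1, r_12):
    """Correlation of y with x2 after x1 has been partialled out of both."""
    return (r_y2 - r_y1 * r_12) / math.sqrt((1 - r_y1 ** 2) * (1 - r_12 ** 2))


rng = random.Random(1961)
n = 2000
signal = [rng.gauss(0, 1) for _ in range(n)]     # what really drives sales (assumed)
nuisance = [rng.gauss(0, 1) for _ in range(n)]   # "nonvalid" variance (assumed)

sales = [s + 0.3 * rng.gauss(0, 1) for s in signal]
x1 = [s + u for s, u in zip(signal, nuisance)]   # valid predictor, contaminated
x2 = list(nuisance)                              # suppressor: the contamination alone

r_y1 = corr(sales, x1)
r_y2 = corr(sales, x2)
r_12 = corr(x1, x2)
r_y2_given_1 = partial_corr(r_y2, r_y1, r_12)

print(f"simple correlation of sales with x2:  {r_y2:.2f}")
print(f"partial correlation, x1 held constant: {r_y2_given_1:.2f}")
```

The negative partial correlation says nothing about x2 depressing sales; it reflects only the service x2 performs in cleaning up x1.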


What would have happened in this case if an
iterative approach (like Wherry-Doolittle) had been
used, but with the IBM 650 instructed to first
extract the advertising variables before the remaining
partial correlations and regression weights were
computed? (This would have been a more defensible
approach to examining the relationship of advertising
to lettuce sales.) I dare say that at least
one of the advertising variables would have stood
out as an important “determinant” of lettuce sales.
I would applaud Meissner’s acknowledgment that
a better approach to studying causal relationships
would employ systematic experiments. Indeed, it
might be better to forget his regression formula
altogether.


Rejoinder 


FRANK MEISSNER 
San Jose State College 


MR. MALONEY makes me regret that I did not
know of him before the manuscript was submitted
for publication. He could have been tapped
to become one of the reviewers of an “earlier ver- 
sion” of the manuscript. I would have gladly given 
him credit. I would certainly have loved to take a 
look at his Bakery Sales Study. It sounds simply 
fascinating. Is it available? 

All I would like to say in my own defense is that 
Maloney under-estimates the meaning of my quali- 
fying sentences. When things “appear” and if “it 
cannot be said with confidence that... ,” then I 
hope to water down whatever was previously said 
about “indications” of causal influence of inde- 
pendent upon dependent variables. 

I do not feel guilty of having interpreted partial 
correlations simply as direct indicators of causal 
effect. 

The Wherry-Doolittle iterative approach (instructing
the IBM 650 to extract the advertising
variables before computing the remaining partial
correlations and weights) is something that did not
occur to me. However, I would like to get the exact
reference, just in case the lettuce study could be
revised and updated.

Maloney seconds my plea for systematic experi- 
mentation as a superior way of studying causal 
relationships. I appreciate that support. But I can- 
not agree with Maloney’s defeatism when he sug- 
gests that “it might be better to forget about the 
regression approach.” It proved an interesting ex- 
ercise in the lettuce study, and I do feel that re- 
gression analyses have useful roles to play, provided 
they are not peddled as panaceas. 

I did not attempt to set the world afire by pub- 
lishing the lettuce study. However, I hoped to 
stimulate some constructive discussion. Mr. Ma- 
loney proves that I succeeded in reaching this 
modest goal. 


The public generally . . . are in the habit of seeing public men making use of 
the most opposite statistical results with equal assurance in support of the most 
opposite arguments. ...If the same ingenuity and enthusiasm . . . should 
have tempted . . . historians to group facts also, it would be no more reasonable 
to make the historical facts answerable for the use made of them than it would 
be to make statistics responsible for many an ingenious financial statement. 

—PRINCE ALBERT, at the opening of the International
Statistical Congress, London, 1860
FEDERAL STATISTICS IN ADVERTISING





The 1958 Census of Business 


INGRID C. KILDEGAARD
ARF Research Statistician 


THE ADVERTISER demands careful estimates of
sales potentials when defining his marketing
objectives or allocating advertising expenditures.
Too often he has to settle for income data or other
sales-related measures from the 1960 Census of
Population and Housing.

Some advertisers can employ actual sales data
obtained from the 1958 Census of Business. These
sales data, especially when combined with population
and housing characteristics, provide an important tool
for measuring advertising potential and performance.


Consumer Sales Patterns 


The 1958 Census of Business found national re- 
tail sales to be 55 per cent higher than in 1948. 
Rates of growth varied in different parts of the coun- 
try. Retail sales in the western states rose by 73 
per cent in the ten-year period, while in the north 
central states the gain was only 45 per cent. Other 
factors besides population shifts contributed to the 
varying trends: per capita sales in the west were 
about 20 per cent above the national average. Per 
capita apparel sales, on the other hand, were 
highest in the northeastern states. More illuminat- 
ing patterns of consumer spending are evident from 
comparisons of smaller areas or of types of business 
establishments. 
Eight Censuses

The Census of Business is one of the eight peri- 
odic Censuses provided by present law. Six of these 
counts, including the one on business, are con- 
ducted every five years by the Bureau of Census. 
Since 1929 there have been seven Censuses of Busi- 
ness. The latest, taken in 1959 for the 1958 calendar 
year, covers the retail, wholesale and service trades. 
For each of these three trade divisions, two series of 
reports have been issued. While study and reporting
methods for the three trade groups are similar,
retail trade is of special interest to advertisers of 
consumer products. 

The 1958 Census represents almost 1,800,000 
establishments primarily engaged in retail trades. 
The Census was conducted by mail: establishments with employed
personnel filled out a form reporting such data as
type of outlet, number of employees, total payroll, 
sales value of merchandise line, credit sales, ac- 
counts receivable and inventories. Sales of non- 
employee establishments, obtained from 1958 Fed- 
eral Income Tax returns, accounted for less than 
ten per cent of the total sales volume. In the 
tabulations, each establishment was classified on the 
basis of its major activity. 


Area and Subject Reports 

In addition to a United States summary, separate 
area reports are available for each state and territory.
Data are given for each county, standard
metropolitan statistical area and incorporated city 
over 2500 in population. These include statistics 
by type of business for number of establishments, 
amount of sales, size of payroll and number of paid 
workers. 
Amount of detail is generally proportional to
the number of establishments. For states, statistics 
are reported for 95 kinds of business groups. For 
small towns and counties, eleven major kinds of 
business are given. Metropolitan areas and large 
counties cover all business groups unless confiden- 
tial information would be revealed. In New York 
County, for example, only two establishments were 
found belonging to the hay-grain-feed store classi- 
fication. To avoid disclosure, sales and payroll data 
are not reported for this classification.

Subject reports provide cross tabulations of the 
retail sales and employment data by sales size, by 
legal form of the organization and by single units 
and multi-units (mostly chain stores). Statistics are 
given by kinds of business for each state and stand- 
ard metropolitan statistical area, the kind-of-busi- 
ness detail depending upon the number of cells 
in the cross tabulations. 


Intra-City Statistics

A supplementary series of reports has just been
issued for 97 SMSA’s comparing the retail sales
information by business district. These comparisons
are made between the central city and the remainder
of the metropolitan area, as well as between
the central business district and the remainder
of the central city. In addition, summary
data are tabulated for nearly 500 smaller retail
centers in the reported areas.

These reports will be of special interest to those
who use point-of-purchase advertising or test
advertising campaigns. Population growth since the
war has been most marked inside metropolitan
areas. But this tendency toward urbanization has
been accompanied by a decentralization within the
metropolitan area. The intra-city statistics on retail
trade provide measures of the resulting shifts in
distribution and sales. Consumer spending in the
mushrooming shopping areas will attract the
advertisers’ attention.

One serious defect of the 1958 Business Census
is that sales are reported by kind of business rather
than by merchandise line. Some supermarkets, for
example, may have a higher sales value of drug items
than the neighboring drug stores. Recognizing the
need to unscramble the sales data, the Bureau of
the Census is planning to isolate about 20 merchandise
lines in the 1963 Census of Business.

Meanwhile, the most complete data on expenditures
in the retail, wholesale and service trades are
from the 1958 Census of Business. Anyone wishing
to receive individual reports or the complete volumes
may obtain order blanks and price lists from
the Superintendent of Documents, U.S. Government
Printing Office, Washington 25, D. C.
The Advertising Research Foundation invites commercial research organizations to submit
reports of unpublished data they have collected using new techniques of advertising research.

The author of the best paper will present it to ARF’s Seventh Annual Conference at the
Hotel Commodore in New York October 3-4, 1961. The Conference will be concerned with meeting
demands for better advertising research in the 1960's.

A jury drawn from ARF’s Technical Committee and Conference Program Committee will
evaluate the reports on the basis of their originality, clarity, soundness of experimental design,
and usefulness of findings.

Manuscripts should be no longer than 15 double-spaced typewritten pages. Ten copies should
be submitted to ARF by July 7, 1961. The name of the author or his organization should not
appear in the manuscript, but only on a 3 x 5 card clipped to one copy.

The Judges will select no winning paper if they find none suitable.

Entries and inquiries should be directed to Dr. Charles K. Ramond, Technical Director,
ARF, 3 East 54th Street, New York 22, New York.
Correlation and Regression


GWYN COLLINS


ARF Research Mathematician 


What is the correlation coefficient and what does it 
measure? 


First, there are many correlation coefficients, not 
just one. All are indexes which measure some kind 
of association between two or more sets of figures. 
The best known, most used coefficient of correla- 
tion is usually attributed to Karl Pearson and is 
often called the Pearsonian coefficient of correla- 
tion. It is also sometimes called the product-mo- 
ment correlation coefficient; we shall refer to it here 
as the correlation coefficient. 

Like most other correlation coefficients, it can 
vary anywhere from —1 through 0 to +1. A 
perfect negative relationship is denoted by —1 (i.e. 
one measurement is high when the other is low), 0 
means no relationship and +1 means a perfect
positive relationship. Relationship, that is, of the 
kind measured by this coefficient. But two sets 
of figures can be related in ways which this co- 
efficient will not measure. 

There are many descriptions of the product- 
moment coefficient and the following is as useful as 
any. Suppose we have two measurements for each 
of a group of men, say their height and weight. The 
heights may average about 68 inches, the weights 
about 150 pounds. We try to reduce both sets of 
figures to the same scale. The first step is to measure 
each person’s height and weight as so many inches 
or pounds above or below the average. The averages
of the new heights and of the new weights are then both
zero. Then we notice that the weights are more
spread out than the heights. The weights may range
from 30 lbs. below the average to 30 lbs. above,
while the heights will all vary within about six
in. of the average. So we divide each set of figures
by an index of its spread—its standard deviation. 
Both sets of figures then have the same average and 
the same spread. These figures are often referred to 
as standard scores. 
The correlation coefficient is easily calculated
from them. It is simply the average of the product
of corresponding standard scores for the two
measurements.

This coefficient measures how nearly scores above
or below the average for one dimension are in
direct proportion to corresponding scores for the
other dimension.
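The steps just described can be followed in a few lines of code. This is a minimal sketch, assuming made-up heights and weights for eight men and assuming that the standard deviation is computed by dividing by the number of cases, which is what makes the plain average of the products come out as the coefficient. The function names are hypothetical.

```python
import math

heights = [64, 66, 67, 68, 68, 69, 70, 72]          # inches (made-up figures)
weights = [128, 140, 150, 145, 155, 160, 158, 175]  # pounds (made-up figures)


def standard_scores(values):
    """Express each value as so many standard deviations above or below the average."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)  # divide by n, not n - 1
    return [(v - mean) / sd for v in values]


def correlation(xs, ys):
    """Average product of corresponding standard scores."""
    zx, zy = standard_scores(xs), standard_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


r = correlation(heights, weights)
print(f"correlation between height and weight: {r:.3f}")
```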


Can I calculate the correlation coefficient for any two
sets of figures?


Yes. The computation of a correlation coefficient 
is purely mechanical. It can be done by hand or by 
machine in many equivalent ways; it is merely a 
way of manipulating numbers. 

Interpreting it is another matter. For this we 
have to know, or assume, a great deal about the 
figures used to compute it. 

The simplest form of the problem of interpretation
is this. We have two measurements for each
of a sample of individuals. The two measurements
appear to vary together; at least they correlate
positively. Could the correlation we have found
between them occur by chance alone or does it
indicate a real relationship between these dimensions
throughout the population from which our sample
of individuals was drawn?


This question, usually answered with the use
of tables, formulae, charts and talk of standard
errors, is, in general, beyond solution. To make it
at all tractable, we make some radical assumptions.

We only have information about a sample of
individuals. We must guess about the kind of
population from which they came. Usually we
assume that the two dimensions are distributed in a
very special way, with bivariate normal distribution
and zero correlation, in the parent population.
Then we try to find just how likely it is that the
sample we have could come from such a population.
This probability depends only on the number
of individuals in our sample and the correlation
coefficient we found in it. For this we can read off
from tables of Student’s t distribution for the
quantity

$$t = r\sqrt{\frac{n-2}{1-r^{2}}}$$

(where r is the correlation coefficient found in the
sample; n is the number of people in it), the
probability of getting any particular size correlation
coefficient in the sample.
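As a worked illustration, the quantity t can be computed directly before turning to the tables. This is a sketch only: it assumes the usual textbook form of the statistic, t = r·sqrt((n − 2)/(1 − r²)), referred to Student's t with n − 2 degrees of freedom, and the sample values (r = .40 from 30 people) are invented for the example.

```python
import math


def t_statistic(r, n):
    """Student's t for a sample correlation r from n pairs, under zero parent correlation."""
    if n < 3:
        raise ValueError("need at least three pairs of measurements")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and +1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


r, n = 0.40, 30          # invented sample values
t = t_statistic(r, n)
critical = 2.048         # tabled two-sided 5 per cent point of t for 28 degrees of freedom

print(f"t = {t:.3f} on {n - 2} degrees of freedom")
print("larger than the tabled value" if abs(t) > critical else "within chance limits")
```

The comparison is only as good as the assumption of a bivariate-normal parent population with zero correlation.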


It is quite possible, of course, that the two sets 
of measurements are not distributed in the way we 
have assumed, in the population from which our 
sample was drawn, though they are correlated 
with one another. Very little is known about the 
amount of correlation we can expect to find in 
samples drawn from populations which are not 
bivariate-normal. Most of the measurements ac- 
tually correlated in market research almost cer- 
tainly do not have bivariate-normal distributions. 
The only consolation known to this writer is that 
E. S. Pearson (1931), Karl’s son, has shown that in 
practice the parent population can vary quite a 
lot from strictly bivariate-normal form and still
give rise to sample correlation coefficients of about
the same size as would a bivariate-normal population.

Another point to remember in interpreting the 
correlation coefficient is that there are many ways 
in which two sets of figures may be perfectly re- 
lated, yet give rise to a correlation coefficient of 
zero. In general, if we find a correlation coefficient 
in a sample, which would be significantly large if 
the sample came from a bivariate-normal popula- 
tion, then there is really a relationship between the 
two measures. On the other hand, if the correlation 
coefficient is zero, we cannot conclude that the 
two measurements are not related. 
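A simple case shows how two sets of figures can be perfectly related and still give a coefficient of zero. In this sketch the figures are invented: each y is exactly the square of its x, and the x values are placed symmetrically about zero. The function name is hypothetical.

```python
import math


def correlation(xs, ys):
    """Product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]   # y is completely determined by x

r = correlation(xs, ys)
print(f"correlation coefficient: {r:.3f}")
```

Knowing x tells us y exactly, yet high values of y go with both high and low values of x, so the proportional relationship the coefficient looks for is absent.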


Are there any coefficients for measuring other kinds of 
relationships? 

Yes. There is, for example, the correlation ratio. 
This concept can best be understood by means of 
an example. Suppose once more that we are deal- 
ing with the heights and weights of a large number 
of people. There will be several people in the 
sample of any one height, say 5 ft. 6 in., and in 
general the weights of such people will not be the 
same. We can speak of a spread or array of weights 
around the average for people of that height. 

Now in general we would expect the spread of 
weights for those people who are 5 ft. 6 in. tall to 
be rather less than the spread for a group half of 
whom are 5 ft. 6 in. tall, and half of whom are 6
ft. tall. Certainly, if height and weight are related 
we would expect this. For this reason we can ex- 
amine the spread of weights for each separate height, 
and if they turn out to be less than the spread of 
weights of people of all heights, we would know 
that height and weight are related in some way. 

This is how the correlation ratio is derived. We 
calculate the ratio of the average spread around 
the middle of each separate array, to the spread of 
all arrays mixed together around their over-all 
average. If there is a close relationship between the 
two measurements then the ratio will be near to 
zero; if there is little relationship, it will be close 
to 1. If we define the word “spread” to mean the
variance (the square of the standard deviation), subtract
the ratio from 1 and take the square root, then we have
the correlation ratio.

This varies from 0 to 1. (It is calculated and de- 
fined in such a way that the negative range from 
—1 to 0 is meaningless.) Numerically, it is always 
at least as large as the ordinary correlation co- 
efficient, and it will reflect many kinds of relation- 
ships between two sets of data which are not meas- 
ured by the regular coefficient. 
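The calculation can be sketched as follows. The sketch assumes the standard definition, in which spread is measured by sums of squared deviations (variances) and the square root is taken at the end; the figures are invented so that y first rises and then falls as x increases, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def correlation(xs, ys):
    """Ordinary product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_ratio(xs, ys):
    """Correlation ratio of y on x: sqrt(1 - spread within arrays / over-all spread)."""
    arrays = defaultdict(list)
    for x, y in zip(xs, ys):
        arrays[x].append(y)          # one array of y values for each separate x
    grand_mean = sum(ys) / len(ys)
    total = sum((y - grand_mean) ** 2 for y in ys)
    within = 0.0
    for array in arrays.values():
        mean = sum(array) / len(array)
        within += sum((y - mean) ** 2 for y in array)
    return math.sqrt(1.0 - within / total)


xs = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]        # invented figures
ys = [2, 4, 7, 9, 10, 12, 8, 10, 3, 5]     # rises, then falls

eta = correlation_ratio(xs, ys)
r = correlation(xs, ys)
print(f"correlation ratio:       {eta:.3f}")
print(f"correlation coefficient: {r:.3f}")
```

With these figures the correlation ratio is high while the ordinary coefficient is small, because the relationship is strongly curved.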

As with the correlation coefficient, there are some
tricky problems of interpretation. To determine 
how likely it is that chance alone could, in a sam- 
ple, give rise to a correlation ratio of some fixed 
magnitude when the parent population shows no 
relationship at all, necessitates an assumption. We 
have to assume that each of the arrays in the parent 
population (in our example, of weights for each 
separate height) follows the pattern of the normal 
distribution. If this assumption is true, then the 
probability of getting any particular correlation 
ratio can be found by using tables of the Incom- 
plete Beta Function. 


What is a regression line? 


A predictive tool. It is a line drawn on a graph 
representing our best guess of the relationship be- 
tween two measurements made for each member of 
a sample when one of the measurements has been 
made with imprecision or is subject to other influ- 
ences. Its purpose is to predict what value of one 
measurement we will probably find if we know the 
value of the other. 

There are many ways of drawing it. To under- 
stand them we must start with a scatter diagram. 
This is a graph on which the two axes represent 
the two measurements. On the graph are points 
each of which represents a sample member. The 
position of each is determined; its distance from
one axis is determined by one measure, its distance 
from the other axis by the other. The regression 
line is a line, not necessarily straight, drawn 
through the points to “fit” them as well as possible. 

Now there are many kinds of lines that could 
be used: straight, or curved in some special way. 
Similarly, there are several ways of interpreting 
the best “fit.” In short there are many regression 
lines. Of them the simplest are the two straight 
“least squares” lines which are used to predict x 
when y is known, and to predict y when x is known, 
where, of course, x and y are the two measurements 
under consideration. 

The least squares line of regression of y on x 
is the line that makes the sum of the squares of 
the vertical distances from each point to the line 
a minimum. The least squares line of regression 
of x on y does the same thing for the horizontal 
distances. Generally, they will be two different lines
and the more closely related the two measurements 
are, the smaller will be the angle between them. 
The actual arithmetic of fitting the regression line 
is described in any elementary statistical text. Es- 
sentially it consists of calculating five different in- 
dexes from the data and using them to solve two 
simultaneous equations to get the slope of the 
regression line and the point at which it intersects 
the y-axis. These lines can, of course, be represented 
by an equation of the form y = a + bx. Consequently
the problem of finding the regression line
is sometimes spoken of as finding the regression 
equation. The two expressions are equivalent. 
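The arithmetic can be sketched in code. This is a minimal illustration, assuming that the five indexes are the number of cases and the sums Σx, Σy, Σx² and Σxy (the text does not list them), and using made-up heights and weights. Both least squares lines are fitted, and the product of their two slopes is printed; that product equals the square of the correlation coefficient, which is one way of seeing why the lines draw together as the relationship becomes closer.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line y = a + b*x that minimises squared vertical distances."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # Solution of the two simultaneous ("normal") equations for slope and intercept.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x * sum_x)
    a = (sum_y - b * sum_x) / n
    return a, b


heights = [64, 66, 67, 68, 68, 69, 70, 72]          # inches (made-up figures)
weights = [128, 140, 150, 145, 155, 160, 158, 175]  # pounds (made-up figures)

a_yx, b_yx = least_squares_line(heights, weights)   # predicts weight from height
a_xy, b_xy = least_squares_line(weights, heights)   # predicts height from weight

print(f"weight = {a_yx:.2f} + {b_yx:.3f} * height")
print(f"height = {a_xy:.2f} + {b_xy:.4f} * weight")
print(f"product of the two slopes: {b_yx * b_xy:.4f}")
```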

The calculations performed on the data are very 
similar to those necessary to compute the correla- 
tion coefficient between the two measurements. 
There are, in fact, many interesting points of over- 
lap between correlation analysis and regression 
analysis, though the purposes of the two techniques 
are different. The former assesses the interdepend- 
encies of different measures by estimating the pa- 
rameters of the parent population, while regression 
analysis offers a means of prediction. 

Sometimes it is obvious from the scatter diagram 
that the line which would best fit the data will 
be curved and not straight. In such a case one has 
to decide what kind of curve. Any kind of curve 
can be fitted to data, but there is no general way 
of letting the data decide the kind of curve for you. 


You can try to fit a parabola, for example, and you
can test how well it works, but your calculations
will not show how much better or worse some other
curve might have fitted. You get out what you put
in; no more. There is no generalized method of
selecting the best fitting functional relationship.

Another important question is whether the regression
line we find in the sample may have been
an accident of sampling, or whether it reflects an
actual relationship in the population from which
the sample came. Again we have to make some
assumptions about the population or we are in
difficulty. The assumptions usually made are that
the discrepancies between the predictions and the
actual values are normally distributed with the
predicted value as average; that the variances of all
these distributions are equal for all values of the
predictions; that each observation, that is each pair
of values, is independent of other observations.
If these assumptions are accepted, then the easiest
way of testing the significance of the line is to use
tables of the variance-ratio or F distribution to
compare the spread of points around the line to the
spread of points around their over-all average. If
the population does not conform to these assumptions,
then the F test is inapplicable and will mislead.
To use it the two quantities whose variances
are compared must both be normally distributed
and independent of one another.


What is multiple regression? 


This, too, is a predictive tool and the regression
line that we have just described is a special case
of it.

As with the regression line the problem is to
fit an equation to a set of measurements. With multiple
regression, however, the problem is to fit not
a line, or an equation representing a line, but a
surface or an equation representing one. In practice
this means that we have measurements of several
different dimensions for each of a sample of individuals.
We wish, in the future, to predict the
measurement of one of these dimensions, for one or
more other individuals, using a knowledge of the
measurements of the other dimensions. We usually
refer to the dimension to be predicted as the dependent
variable and to those dimensions we use to
make the prediction as independent variables. The
problem is how to find the best combination of
independent variables with which to predict the
dependent variable.

As with the determination of the regression line, 
we must first use some judgment about the kind 
of equation to try. Most often we try a simple 
linear combination of the independent variables. 
The equation will then be: $y = a + b x_1 + c x_2 + d x_3 + \dots + z x_n$,
where y is the dependent variable;
$x_1, x_2, x_3, \dots, x_n$ are independent variables,
and a, b, c, d, etc., are regression coefficients
to be determined.

This decided, the arithmetic operations necessary 
to compute the regression equation are extremely 
simple though tedious, the tedium increasing 
rapidly with the number of independent variables 
used. As with the regression line, the task con- 
sists of deriving certain indexes from the data 
and solving a set of simultaneous equations to find 
the regression coefficients. The method is described 
in most statistical texts. The labor of computation 
with a large number of independent variables is, 
however, indescribable. Fortunately, the indexes 
are very quickly computed on an electronic com- 
puter. As for the solution of the simultaneous equa- 
tions, this is a field where the computer puts all 
human efforts to shame. In fact, so speedy and 
straightforward is this job of calculating regression 
equations on a computer, that regression analysis 
has now become very fashionable. A few years ago 
many of the regression equations now reported 
would have been almost impossible to calculate. 
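
For the reader with a computer at hand, the two steps can be sketched as follows. This is a minimal illustration, not the routine of any particular installation: it assumes that the "indexes" are the sums of squares and cross-products of the measurements, and the function name and data are hypothetical.

```python
def fit_multiple_regression(rows, ys):
    """Least-squares coefficients [a, b, c, ...] for y = a + b*x1 + c*x2 + ..."""
    # Prepend a constant 1 so that the first coefficient is the intercept a.
    design = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(design[0])

    # Step 1: derive the sums of squares and cross-products.
    lhs = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(k)]

    # Step 2: solve the simultaneous equations by Gaussian elimination
    # with partial pivoting.
    aug = [lhs[i] + [rhs[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("independent variables are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]

    # Back-substitution recovers the regression coefficients.
    coeffs = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(aug[i][j] * coeffs[j] for j in range(i + 1, k))
        coeffs[i] = (aug[i][k] - tail) / aug[i][i]
    return coeffs


# Hypothetical data constructed so that y = 1 + 2*x1 + 3*x2 exactly.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
ys = [9, 8, 19, 18, 26]
a, b, c = fit_multiple_regression(rows, ys)
```

The number of cross-products to be formed grows roughly with the square of the number of independent variables, which is why the labor by hand mounts so quickly.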

Electronic computers have done so much to stimulate
regression analysis that there is nowadays a
tendency to take problems involving only a small
number of independent variables to them for solution.
There are, however, ways of dealing with
these smaller problems by head and hand. This
writer would guess that with, say, a sample of 300
individuals, the time to go to an electronic computer
would be when the number of independent
variables used was more than 12.

One more point about multiple regression. The
same assumptions are used to justify this method
as for the regression line. In particular, the independent
variables should not be thought of as
having some kind of probability distribution; they
are fixed values. The dependent variable on the
other hand does have a probability distribution.
It is this assumption that distinguishes regression
analysis from correlation and factor analysis. In
mathematical language, regression analysis is a
univariate problem; factor analysis is a multivariate
problem.


What are the multiple, partial and multiple-partial cor- 
relation coefficients? 


The multiple correlation coefficient, usually denoted
by R, measures how well the multiple regression
equation works.

In the regression equation,
$y = a + bx_1 + cx_2 + \ldots + zx_n$,
one side represents the dependent variable $y$,
the other represents a linear combination
of the independent variables, each multiplied by
its regression coefficient. Since the equation is not
perfect but only the best that we can fit, there will,
in practice, be discrepancies between the two sides
of the equation. The multiple correlation coefficient
is the product-moment correlation coefficient between
the numbers predicted on the right hand
side of the equation, and the actual numbers obtained
on the left hand side of it.

Because of the way it is derived it is never 
thought of as being less than zero. Nor can it be 
greater than +1, since the product-moment correla- 
tion coefficient cannot be larger than +1. 
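
The definition of R can be sketched directly in Python. The function names and figures below are hypothetical; note that the sketch will return a negative value if it is handed predictions that did not come from a least-squares fit, whereas R computed from a fitted regression equation is not negative.

```python
import math


def pearson_r(us, vs):
    """Ordinary product-moment correlation between two equal-length lists."""
    n = len(us)
    u_bar = sum(us) / n
    v_bar = sum(vs) / n
    cov = sum((u - u_bar) * (v - v_bar) for u, v in zip(us, vs))
    suu = sum((u - u_bar) ** 2 for u in us)
    svv = sum((v - v_bar) ** 2 for v in vs)
    return cov / math.sqrt(suu * svv)


def multiple_r(actual, predicted):
    """Multiple correlation: correlation of the actual values of the
    dependent variable with the values predicted by the equation."""
    return pearson_r(actual, predicted)


# Hypothetical actual values and the values an equation predicted for them.
actual = [1, 2, 3, 4]
predicted = [2, 2, 3, 3]
r = multiple_r(actual, predicted)
```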

Interpreting it is difficult. The usual question is
whether the multiple correlation coefficient which
we have found from a sample of individuals got there
by chance or whether it reflects a relationship in
the population generally. Again we can only answer
the question easily by assuming the general population
to be multivariate-normal with no relationship
between the dependent variable and any combination
of the independent variables. To find how
likely it is that such a population could give rise
to a sample correlation coefficient of any particular
size, we can again consult tables of the Incomplete
Beta Function, for the coefficient R is mathematically
very similar to the correlation ratio.

The partial correlation coefficient is also one of 
the sidelights of multiple regression. It measures 
the association between the dependent variable and 
one of the independent variables when all the 
other independent variables are taken into account. 
While this sounds complicated, it can be computed 
from the ordinary (or as they are sometimes called, 
the zero-order) correlation coefficients between the 
dependent variable and each of the independent 
variables and between all pairs of the independent 
variables. 
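
For the simplest case, one dependent variable and two independent variables, the computation from zero-order coefficients can be sketched as below. The formula is the standard first-order one; the function and argument names are hypothetical, and with more independent variables the same step is applied repeatedly.

```python
import math


def partial_correlation(r_y1, r_y2, r_12):
    """Correlation between y and x1 with x2 taken into account.

    r_y1: zero-order correlation between y and x1
    r_y2: zero-order correlation between y and x2
    r_12: zero-order correlation between x1 and x2
    """
    denominator = math.sqrt((1 - r_y2 ** 2) * (1 - r_12 ** 2))
    return (r_y1 - r_y2 * r_12) / denominator


# Hypothetical zero-order coefficients, for illustration only.
r = partial_correlation(0.6, 0.5, 0.4)
```

When the second independent variable is unrelated to both of the others, the partial coefficient reduces to the ordinary one.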

After it has been computed there remains the
question of what it means. It behaves in the same
way as the ordinary correlation coefficient and can
only be dealt with reasonably if the assumption is
made that the parent population is of the multivariate
normal kind. If it is, then the same tests
can be applied as to the ordinary correlation
coefficient; if it is not then the researcher is on the
new frontier.

The multiple-partial correlation coefficient is a
newcomer. Introduced by Cowden, it describes the
correlation between the dependent variable and
a group of independent variables with the effect of
all other independent variables taken into account.
Its method of computation is described by Cowden
(1952) and Cowden & Croxton (1960). Its distribution
does not seem to have been studied in the
literature but it is obvious that unless the parent
population is assumed to be multivariate normal,
little can be said about it. If it is, then the coefficient
will be distributed in the same way as the
ordinary correlation coefficient.


How sure are predictions made from regression equa- 
tions? 


It depends on what values of the independent 
variables we start with and how well the regression 
equations fit the data. 

To make the prediction from the independent
variable with a simple regression line, for example,
we merely use the line to read off the appropriate
value of the dependent variable. In the figure,
for instance, the value $Y_0$ of the dependent variable
corresponds to the value $X_0$ of the independent
variable. $Y_0$, however, is only the most probable
value for the prediction. The actual value of the
dependent variable may turn out to be either more
or less than it. If we assume that we are dealing
with a population satisfying the regression assumption,
then we can say something about the range
around $Y_0$ in which the true value will fall.

[Figure: Prediction limits of a regression line, with $X_0$ marked on the horizontal axis.]

We usually describe this kind of range by saying
that we want to set limits on $Y_0$ so that we are sure,
say nine chances in ten, that the true value will
fall within them. The size of this range will vary
for different values of the independent variable. It
is conveniently described by the shaded area in
the figure. The two curved lines which enclose
the regression line are hyperbolas and once calculated,
can be used to establish a prediction interval
for any value of the independent variable. The
method is almost self-explanatory. For $X_0$ in the
figure, for example, $Y_1$ and $Y_2$ set the range of the
predictive interval. The hyperbolas, however, are
not generally calculated; instead the prediction
interval is established for any particular level of the
independent variable.
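
Establishing the interval at one particular level of the independent variable can be sketched as follows. This uses the standard textbook formula for a new observation, which is an assumption here since the article gives no formula; the function name and data are hypothetical, and the value of Student's t must be looked up in tables for n - 2 degrees of freedom.

```python
import math


def prediction_interval(xs, ys, x0, t_value):
    """Return (lower, predicted, upper) for a new observation at x0.

    t_value is Student's t for n - 2 degrees of freedom at the chosen
    level of confidence, supplied by the caller from tables.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual standard deviation: how well the line fits the data.
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(ss_residual / (n - 2))

    # The half-width grows as x0 moves away from the mean of the xs,
    # which is what bends the limits into hyperbolas.
    half_width = t_value * s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
    predicted = intercept + slope * x0
    return predicted - half_width, predicted, predicted + half_width


# Hypothetical data; 2.353 is the tabled t for 3 degrees of freedom
# when we want to be right nine times in ten.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
lower, predicted, upper = prediction_interval(xs, ys, 3, 2.353)
```

Repeating the calculation over a run of values of the independent variable would trace out the two curved lines of the figure.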


The hyperbolas are symmetrical in their vertical
distances from the regression line. The distance,
however, is a function of two things. First, if we
want to set prediction limits so that we will be right
99 times in 100, then the hyperbolas will be pushed
out from the line to give us a wider range of error.
In short the distance depends on the range of
tolerable error. Second, the hyperbolas will be close
to the line if the equation fits the data well; they
will be far out if the equation is a poor fit.

I have a great subject [correlation] to write upon, but feel keenly my literary
incapacity to make it easily intelligible without sacrificing accuracy and thoroughness.


—Sir FRANCIS GALTON


RESEARCH IN REVIEW


GWYN COLLINS, THORNTON C. LOCKWOOD and GEORGE M. SHIREY
ARF Staff


Long Range Effects 
of TV Viewing 


Hilde T. Himmelweit, A. N. Oppenheim and Pamela 
Vince. Television and the Child. London: Oxford 
University Press, 1958. 


W. A. Belson. Effects of Television on the Interests and 
Initiative of Adult Viewers in Greater London. The 
British Journal of Psychology, Vol. 50, Part 2, May 
1959, pp. 145-158. 


W. A. Belson. Television and Family Life. Advancement 
of Science, Vol. 16, No. 64, 1960, pp. 349-353. 


These are reports of three relatively recent studies.
Although the longest paper is but 58 pages, each
answers in concise terms, and with little statistics,
some questions long argued by communications
experts and parents alike.

For those who have spent little time in Britain,
we recommend they start with W. A. Belson’s
Television and Family Life, for this paper gives some
insight into British life. We learn, for example, that
for half of the families studied, the only viewing
room is also the only family room heated in the
winter; anyone who has spent a winter in Britain
will realize the importance of this factor. In almost
all instances television was viewed with some light
in the room, thus enabling other activities to be
carried out. A few selected examples of a viewing
room scene are given by Dr. Belson, although he
points out that each situation varies considerably.

One aspect of family life investigated in the
Belson study was how television affects parents’
relationships with their children. He found many
parents had trouble getting children to bed, that
there was some rebelliousness about what would be
viewed, that television was used by the parent as a
bribe and as a personal escape from the attention
of the child.

With these findings in mind and some conjecture
of family life in Great Britain, we can now turn
to an earlier and more complete report of the
effects of television on the child by Himmelweit,
Oppenheim and Vince. In this study, which was
sponsored by the Nuffield Foundation, it was found
that viewing caused only a slight delay in the child’s
bedtime, not more than 20 minutes, and in the case
of the nontelevision owners (control group), children
spent more time playing or reading in bed.
The Nuffield study showed little difference in
effective sleeping time between viewers and nonviewers.
As in the Belson study, it was found that
television is used as an instrument of discipline and
that conflicts do occur on program selection.

The prime concern was, of course, the impact 
of television on children and young people. The 
research design compared two groups: one of chil- 
dren exposed to television in their homes, and one 
not exposed to television in their homes. Each 
viewer was matched individually with a “twin” 
control child in terms of age, sex, intelligence, and 
social background, and was selected, as far as pos- 
sible, from the same classroom. The main survey 
was carried out in four English cities, where 4,506 
children were interviewed. Of these, 1,854 matched 
viewers and controls were selected. 

In addition, a before-and-after study was carried 
out in a city where television was just being intro- 
duced. The purpose of this smaller study was to 
determine what differences, if any, existed between 
those households which bought a television set and 
those which did not. This study concentrated on 
two age groups, 10 to 11 and 13 to 14 year olds. 

Happily, the main study shows that Britain is not
raising a future generation of “Flowerpot Men”
and we therefore may assume that the U.S.A. is not
raising a generation of “Yogi Bears.” On the average,
children in Great Britain view television about
two hours a day, and although they become more
critical and less attached after three years of ownership,
they only reduce their viewing by about two
hours a week in that time. This would indicate
that television does become a habit. Favorite programs
reported by the viewers were mostly adult
programs, particularly crime thrillers and to some
degree comedies, variety programs and family
serials.

Television does not seem to help or hinder most 
children in acquiring general knowledge. It does, 
however, have an influence on younger children 
who have not yet learned to read. For such children 
it seems to produce a gain equivalent to four or 
five months in intellectual development. In school 
the viewer seems no more listless, nor does his 
power of concentration seem less than in non- 
viewers. Television does, at first, decrease the ratio 
of books to comics read but, after a few years, view- 
ers read as many books as nonviewers. Apparently 
television actually stimulates, rather than hinders, 
reading after the first few years of set ownership. 

Intelligence emerged as the most important fac- 
tor in determining viewing patterns. The more 
intelligent the child, the less his interest in tele- 
vision and the less he was inclined to watch it. At 
adolescence television apparently becomes less im- 
portant for most children. 

Drs. Himmelweit and Oppenheim and Miss Vince 
conclude their booklet with one chapter on prin- 
ciples and generalizations and another on implica- 
tions and suggestions. In these, different sections 
are directed to special groups interested in tele- 
vision: the sociologist, the communications expert, 
the program planner and the parent. 

The third paper reported here, another by W. A. 
Belson, is of interest not only because it treats 
another type of audience, thus rounding out the 
picture of television in Great Britain, but also 
because of the unusual research technique it em- 
ployed. In the previous study, a control group of 
nonviewers was used. In addition, a smaller study 
was undertaken using the before-and-after tech- 
nique to determine if any differences existed be- 
tween those purchasing a set and those not pur- 
chasing a set. 

Dr. Belson, aware of the problems of matching
two groups and the possibility of inherent differences
between set owners and nonset owners, used
what he calls “the stable correlate method.” In this
method, nonviewers’ answers were adjusted by
weighting them by a set of matched variables which
proved to be predictors of the behavior under
study. In his study Dr. Belson tested some 136
predictors (occupational background, type of family,
etc.) from which he then selected three which
proved to be as good predictors as all others combined.
The adjusted score of the nonviewer was
then compared with the unadjusted score of the
viewer.

Belson used two measures in determining interest
in 50 activities such as home decorating, movie-going,
bird-watching, etc. One was a simple six-point
scale of degree of interest toward the activity,
the other a measure of the respondent’s participation
in it. Acts of initiative were measured in terms of
frequency of occurrence. Such an act, as Belson
defines it, was something “done off your own bat.”
Briefly, the study showed that viewers’ interests, in
terms of both strength of interest and activity, were
reduced. Activity associated with viewers’ interests
proved to be reduced sharply in the first year, was
maintained in the second year and returned to the
pretelevision level after the fourth year. Television,
it seems, has made interest more passive; the loss
of participation is greater than that of interest.

All of these reports have been described to some 
extent in various media, and by Jack Gould in the 
New York Times. This reviewer unhesitatingly 
recommends these studies; each is relatively short, 
extremely well written and will interest every 
thoughtful reader. 


—GEORGE M. SHIREY


A Reminder 
About Forgetting 


James P. Wood. Advertising and the Soul’s Belly, Repetition
and Memory in Advertising. Athens: University
of Georgia Press, 1961, $3.50.


Any irresolution induced by this title will not 
be overcome on the inside title page. Even the 
table of contents, despite sub-titles under chapter 
headings, provides an inadequate clue to the con- 
tents. The cover-to-cover reading to which most 
readers will surely succumb, comes from the flavor 
of the writing, best transmitted by a few excerpts. 


On the generality of laboratory tests: The findings of Ebbinghaus
and others have about as much universality as the
historical fact that I ate anchovy paste with my breakfast
toast this morning.

On the Ebbinghaus curve and law: Because of its attractive 
clarity and definiteness, the Ebbinghaus Curve and Law, and 
the comparable findings announced by his successors and 
imitators stand, for too many, as the law of the Medes and 
Persians. 

On the motivational level of most readers: Despite what
an advertiser may claim, there are usually no transporting
rewards to be gained from reading and learning an advertisement
and no dire punishments if the reader fails to note
and remember.

On repeat exposure: Repeat exposure may act like a golf
club used to probe a water hazard in search of a lost ball.
Usually, it stirs up the mud and fails to strike the ball which,
like most memories, is sunk deep in oblivion.

On the utility of frequency and continuity: The question 
as to what constitutes useful frequency and continuity is not 
entirely unrelated to how high is up and how far down is 
really sideways. 


On advertising and science: Advertising is not a science. 
Here and there it impinges on one of the social sciences, 
usually with stultifying effect. They pass each other like 
thieves in the night, sometimes with a tangential side-swipe, 
but the meeting is seldom firm enough for contagion. 


The author’s observations are discerning and 
incisive. What is lacking in detail is more than 
made up for in clarity of style and breadth of treat- 
ment. This interpretation of other writers makes 
reference to the author’s 75-item bibliography hard 
to resist. 

The book is divided into six chapters, the first 
of which considers perception as a necessary pre- 
requisite to memory. 

Chapter Two provides an introduction to mem- 
ory in terms of philosophical definitions, early psy- 
chological definitions, and the classical experiments 
on rote learning conducted by Ebbinghaus and 
others. Simply but clearly, the author explains such 
basic concepts as the number of repetitions needed 
to produce criterion learning, the rate of decline 
of subsequent recall over time, the effect of intro- 
ducing association, meaning, rhyme and rhythm, 
the use of such learning materials as nonsense 
syllables, verses, digits and motor skills, and the 
effect of introducing such variables as sleep, moti- 
vation, and visual or auditory variations. The im- 
pact of Thorndike on education, and Strong, Scott 
and others on advertising is summarized. 

Chapter Three demolishes delightfully the gen- 
erality of classical experiments, and introduces the 
concept of memory as a dynamic entity. 

The next chapter expands upon the role of the 
pre-conscious, unconscious and neurological influ- 
ences on memory. 

The concepts of frequency and continuity are 
discussed in Chapter Five, together with some of 
the attempts to measure the value of each. 

He concludes in Chapter Six by discussing the 
qualitative values of magazines and their audiences 
in fulfilling advertising objectives. 

The author introduces his work modestly: “The
monograph adds nothing to the sum of human
knowledge.” He concludes: “The reader brings all
of this (his experiences and memories) to his present
perception of that to which he is exposed. That
present perception becomes a part of all he is, and
he brings a new total to a new perception when the
exposure is repeated.” Between these statements
the author fulfills his major objective: “to bring
together and attempt to evaluate some of the more
significant findings about memory and repetition
and relate these to advertising.”

This book is highly recommended for those in 
advertising who have forgotten that the classical 
literature on memory gave important guidance to 
present advertising practice. 


—THORNTON C. LOCKWOOD


Early Answers 
from Product Tests 


Benjamin Lipstein. Tests for Test Marketing. Harvard 
Business Review, Vol. 39, No. 2, March/April 1961, 
pp. 74-77. 

In theory, test marketing is a valuable tool to 
determine whether or not a new product should be 
manufactured and distributed nationally. Its value 
is sometimes limited, however, by the fact that 
some tentative decision must be made before the 
test is complete. Dr. Lipstein discusses in this 
management memo how to use brand share data 
from store audits and consumer panels to make a 
preliminary market evaluation. 

Usually one of the following two brand share
patterns turns up in a test market. In the “typical”
situation the brand share of the new product increases
quickly, then falls off sharply to some stabilized
level. In another common situation, the
brand share increases only gradually, then tapers
off to some fixed level. The first step in determining
which of these patterns a tested product may take,
Dr. Lipstein suggests, is to examine brand share by
its components: new tryers and repeat buyers. In
other words, how much of the brand share is made
up of new tryers attracted by heavy promotion?
How many buy the new product the second time
and how does the new product’s share of repeat
buyers compare with those of competitive brands?
With this information some picture of the market
can be formulated and a tentative decision to extend
the test to other markets or discontinue the
product can be made. Dr. Lipstein shows how the
four basic sets of data derived from store audits and
consumer panels—brand shares, new tryers, repeat
buying rates and hard-core buyers—can inform
management where new tryers are coming from
and what is happening to those who fail to repeat
their purchases.

This article is, in a sense, introductory to other
works in which Dr. Lipstein has been engaged. His
interest in the market as a dynamic process contrasts
with customary descriptions of market shares
in static terms, and reflects the most important current
trend in statistical theory. In this lucid article
he has helped managers to see this new direction.

—GEORGE M. SHIREY


Low Response
Risks Bias


W. F. F. Kemsley. Interviewer Variability and a Budget
Survey. Applied Statistics, Vol. 9, No. 2, June 1960,
pp. 122-128.


Sources of error in a personal interview survey
are of two kinds: those due to sampling which can,
in suitable cases, be calculated and those due to
other influences whose magnitude is not ordinarily
assessed. Such influences include question bias,
selective completion and interviewer variability. This
article is an excellent analysis of this latter source
of bias, supplementing another published in 1956
by Gray who, like Kemsley, was with the British
Government Social Survey. Gray’s research suggested
that interviewer variability increases when
questionnaire items are poorly defined, or when
answers reflect attitudes rather than facts. This
topic is of particular importance since analyses of
this phenomenon rarely find their way into the
published literature. In many research reports
interviewer training, and checking precautions used
to reduce interviewer variability, are glossed over
lightly if reported at all.

The design of the survey provided for dividing 
a pre-selected list of addresses for 24 of 128 primary 
sampling units into two equivalent, interpenetrat- 
ing samples, thus controlling variation due to loca- 
tion and time. The survey was concerned with 
budget data recorded by the respondent, and the 
interviewer was required to deliver, explain, check 
and collect the respondent’s record book. Because 
of the interviewer's limited participation in the sur- 
vey, little interviewer variability was anticipated. 

The author divided his 24 interviewer-pairs into
four equal groups of six as follows. First, he divided
the 24 into two equal groups such that in
one, both interviewers had a completion rate above
50 per cent, while in the other, one or both had a
rate below 50. Each of these two groups was then
sub-divided to provide two groups, one with a high
and the other with a low spread between completion
rates of interviewers in the pair. While the author
states that no particular significance should be
attached to the precise boundaries of the four
groups, we may speculate whether more or fewer
differences would have resulted had the interviewer-pairs
been distributed on some other basis, such as
the average completion rate of each pair, the completion
rate of the better or poorer interviewer in
the pair, or the difference between completion rates
of paired interviewers. The results of such analyses
would have been particularly interesting since the
author’s analysis indicated that the categories of
rented/owner-occupied accommodations, number of
rooms, occupation of household head, and total
expenditures, tended to have most variance in the
low response, high spread group of interviewer
pairs.

The author concludes: “It seems that the sample 
achieved by someone with a poor response is not a 
representative selection of the pre-selected list on 
which the interviewer is working, but is biased in 
a number of directions. While there is no reason to 
believe that this bias is caused by any deliberate 
or conscious act of the interviewer, nevertheless the 
effect is rather as though a biased sample had been 
used. A low response will tend, therefore, to make 
ineffective much of the work put into designing and 
selecting a probability sample.” 


—THORNTON C. LOCKWOOD


For and Against 
Motivation Research 


Franklin B. Evans. Psychological and Objective Factors
in the Prediction of Brand Choice: Ford versus
Chevrolet. Journal of Business, Vol. 32, No. 4,
October 1959, pp. 340-369.


Charles Winick. The Relationship Among Personality 
Needs, Objective Factors, and Brand Choice: A Re- 
examination. Journal of Business, Vol. 34, No. 1, 
January 1961, pp. 61-66. 

These papers dispute the value of motivation 
research. 

The earlier paper by Dr. Evans discusses, with
thinly veiled partiality, the differences between
motivation research and more traditional methods.
It then describes a small scale study designed to
test how well some of the instruments of motivation
research work. In his study, Evans applied a part of
the Edwards Personal Preference Scale to a random
sample of 140, 1955-1958 Ford and Chevrolet owners
in Park Forest, Illinois. This gave him scores
for each car owner on ten “personality needs.” He
showed that no simple combination of these scores
was very useful in discriminating between the Ford
and Chevrolet owners. Moreover, 18 psychologists
given the scores and a protocol describing Ford
and Chevrolet owners in personality terms, did no
better.

In the second part of the study, Evans asked his 
car owners to judge whether each of 21 brief per- 
sonal descriptions was more appropriate to a Ford 
or a Chevrolet owner. The 21 descriptions were 
drawn from the explanation of personality needs 
described in the Edwards Personal Preference Scale. 
He found that many characteristics the Ford own- 
ers thought appropriate to Ford owners, Chevrolet 
owners thought appropriate to Chevrolet owners. 
A further analysis showed that each car owner 
tended to think a person with his personality needs, 
would own a car like his own. 

Now this study may not invalidate the techniques 
of motivation research at all. But it is clearly not 
designed to further their cause. Despite the slow 
pace of academic controversy, we already have a 
paper by Dr. Winick, who attempts to rebut Dr. 
Evans point by point. 

Winick’s most telling criticism is that Evans has
lumped together the 1955, ’56, ’57 and ’58 Chevrolets
as though they were a single model and, of
course, he has done the same thing with Ford. It
seems that each model year might be more profitably
regarded as unrelated to other model years.
If correct, this disposes of the central argument in
Evans’ paper.

Winick also comments on some other points
which we can assume he himself regards as relatively
unimportant. There is, for example, a reference
to the demographically homogeneous nature
of Park Forest which Winick believes restricted the
predictive power of the psychological variables
rather than enhanced it. Then at the end of the
paper there is the unusual suggestion that a multiple
regression analysis should have been used
rather than the linear discriminant function used
by Evans, though here they are almost the same
thing.

Dr. Evans, wishing to test the illusive techniques
of motivation research, has missed the mark somewhat,
and Dr. Winick, apart from one thoroughly
good point of rebuttal, trails off into some not too
relevant criticism. But these two papers together
will constitute one of the livelier stretches in any
future anthology of the pros and cons of motivation
research.

—GWYN COLLINS


On Being
Too Agreeable


Arthur Couch and Kenneth Keniston. Yeasayers and 
Naysayers: Agreeing Response Set as a Personality 
Variable. Journal of Abnormal and Social Psychol- 
ogy, Vol. 60, No. 2, March 1960, pp. 151-174. 

This article reports an investigation which 
sought to find whether a tendency to agree or dis- 
agree with questionnaire items, regardless of their 
content, is related to other personality factors. 

The investigators first constructed an Over-all 
Agreement Scale. This was extracted from a variety 
of personality tests and other items and was de- 
signed to be “psychologically balanced.” That is, 
the scale contained equal numbers of items on op- 
posite ends of each of the dimensions measured. 

Correlations between this scale and eight personality
tests led the authors to postulate definite
tendencies, associated with certain personality factors,
to answer questions positively rather than negatively.
This tendency, which the authors called
“Yeasaying,” is apparently unrelated to authoritarianism,
but is related to impulsivity.

In the second part of the study the authors tested 
the hypothesis that a strong tendency to agree or 
disagree was associated with specific personality 
syndromes. They predicted that there would be 
differences between extreme Yeasayers and Nay- 
sayers (the opposite numbers of Yeasayers) in seven 
major areas which they define clinically. 

This hypothesis was confirmed: “In conclusion, 
we feel that this integrated study, though limited 
by the size and nature of our population, has dem- 
onstrated both the far-reaching importance of re- 
sponse set in the area of psychological tests and the 
major proposition that the agreeing response tend- 
ency is based on a central personality syndrome.” 

The authors of this article have reported their
study with proper scientific detachment and have
not dwelt on applications of their findings. But by
this time several questions will have occurred to
most readers. They will ask, for instance, whether
Yeasaying is going to be just another interesting
psychological phenomenon and no more, or
whether it is related to the question the researcher
asks or to something in which the advertiser is
interested.

So many researchers’ questions depend upon
simple agreement or disagreement from the respondent
that Yeasaying may be an important
factor on even the simplest questionnaire.


—THORNTON C. LOCKWOOD

Some Surprises
About Bigots


Milton Rokeach. The Open and Closed Mind. New
York: Basic Books, Inc., 1960. $7.50.


During the Second World War a great deal of 
interest was aroused in the anti-Semitic personality 
and more generally in the kind of person who 
could support or administer an authoritarian re- 
gime. This interest found its culmination in the 
book, The Authoritarian Personality, and left an
enduring legacy in a questionnaire, the “F” (for 
Fascism) Scale. This questionnaire was said to be 
able to identify a certain kind of authoritarian. 
Professor Rokeach and his colleagues have taken 
this work as their starting point and in the volume 
under review, part of which has appeared in pre- 
viously published papers, he develops the notion of 
authoritarianism of a more general kind which he 
refers to as dogmatism. 

Though Dr. Rokeach is no scourge to tautology, 
he does have empirical findings to present. Perhaps 
the value of any research can be measured by its 
capacity to surprise us and some of the data and 
discussion he presents is, here and there, mildly 
surprising. Moreover, his comments on his data 
readily suggest some further hypotheses of great in- 
terest to the advertising researcher. 

In introducing his ideas Dr. Rokeach discusses 
bigotry generally. He expects to find bigots wholly 
intolerant of opposing schools of thought. He sug- 
gests, for instance, that the Communist is likely to 
be quite as intolerant of opposition, quite as “au- 
thoritarian” in his way, as the Fascist. Certain 
qualities, he argues, are the common properties of 
all extremists. Such people tend, for example, to 
exaggerate the differences and minimize the similar- 
ities between themselves and their opponents; they 
are more able to hold inconsistent beliefs; they 
differentiate less between the beliefs to which they 
are opposed; they are less adjusted to the present,
and their lives are guided rather more by their
hopes for the future or by their respect for the long
past.

With these expectations in mind, Rokeach has
developed two scales, one of Dogmatism, the
other of Opinionation. These scales were applied to
various groups of respondents in New York, the
Midwest and in England. In England, for example,
the scales placed various political groups
on a continuum, not from Left to Right, but on a
continuum of bigotry which has much appeal to
our intuition. This and other evidence he presents
to show that these scales measure a dimension of
behavior known to common sense, but independent
of any particular ideological viewpoint. He hypothesizes
that dogmatism may be evidenced not
only in political and religious fields but more generally
in every area of opinion.

This dimension is not used simply as a descriptive
device; Dr. Rokeach develops it as a predictive
instrument. He first argues, and then demonstrates, 
that persons scoring high on the Dogmatism Scale 
will have some difficulty in synthesizing new infor- 
mation into that in which they already believe. His 
demonstration is provided by a series of experi- 
ments with interesting problem situations. Respond- 
ents are asked to think aloud as they deal with 
each problem and Rokeach discriminates between 
the stages of analysis and synthesis in their solution. 
He predicts that respondents long on dogmatism, 
though not distinguished for their difficulty in the 
analytic phase, will have difficulty in synthesizing 
the solution. His hypothesis seems to be consistently 
confirmed. 

He finds, too, his dogmatic respondents far more 
reluctant to break away from opinions which they 
regard as authoritative; they defect much less easily 
from a party line. 

This is but a poor sampling of the many ideas 
that Dr. Rokeach presents and supports. His book 
is pleasantly free from circular arguments about 
group affiliations, findings which do no more than 
illustrate definitions, and other sociological pad- 
ding. More than this he suggests many new hypoth- 
eses. 

One wonders whether the dogmatic person is
the same person who tends to occupy extreme posi- 
tions on attitude scales; whether he is the person 
who, persistent in his beliefs about a product, is 
brand loyal; whether he is relatively immune to 
advertising persuasion of most kinds. 

This book offers a thoughtful discussion of sev- 
eral research hypotheses supported by some small 
scale findings. It is clear, too, that the dimension it 
explores is of central importance to the advertiser. 
The advertising man will appreciate the author's 
successful attempt to make this book intelligible 
to a wider circle than that of the author’s professional
colleagues.

—GWYN COLLINS

PUBLICATIONS RECEIVED


NAOMI BORETZ
ARF Librarian 


Who uses the Library? Of our visitors so far this
year, seven were from overseas, 11 were affiliated
with universities or research organizations, nine were
from advertising agencies, five were from advertisers,
and nine were from media.

What do they seek? Recent visitors to the Library
have been particularly interested in the section devoted
to the social sciences: motivation research studies,
subliminal perception, methods derived from social
science, criticisms of motivation research, standards
for research, and theories, techniques, and methods.

Periodicals we receive in this field include: 


American Journal of Sociology. University of Chicago Press. 
Bi-monthly. Editor: Peter M. Blau. 


American Psychologist. American Psychological Association. 
Monthly. Editor: John C. Darley.


American Sociological Review. University of Oregon. Bi- 
monthly. Editor: Harry Alpert. 

Behavioral Science. Mental Health Research Institute, Uni- 
versity of Michigan. Quarterly. 

Contemporary Psychology. American Psychological Associa- 
tion. Monthly. Editor: Edward G. Boring. 


International Social Science Journal. UNESCO. Paris. Quar- 
terly. 


Journal of Applied Psychology. American Psychological Asso- 
ciation. Bi-monthly. 


Journal of Consulting Psychology. American Psychological
Association. Bi-monthly. Editor-elect: Edward S. Bordin.


Journal of Experimental Psychology. American Psychological
Association. Monthly. Editor: Arthur W. Melton. 

Psychological Abstracts. American Psychological Association. 
Bi-monthly. Executive Editor: Horace B. English. 


Psychological Bulletin. American Psychological Association. 
Bi-monthly. 


The following publications have been acquired 
by ARF’s Library since the last issue of this Journal 
(March 1961): 


Belson, W. A. and C. R. Bell, A Bibliography of Papers 
Bearing on the Adequacy of Techniques Used in Survey 
Research. London: The Market Research Society, 1960. 
Je pp. 

Belson, W. A., Effects of Television on the Interests and
Initiative of Adult Viewers in Greater London. Reprint No.
101. London: Research Techniques Division, London School 
of Economics and Political Science, 1959. 


Bolger Company, The, Media Image Profiles. Chicago: General
Management Publications, 1960-61. 32 pp.


British Broadcasting Corporation, The Public and the
Programmes. A Report on an Audience Research Enquiry.
London: British Broadcasting Corporation, 1959. 71 pp. 


Buckley, Earl A., How to Increase Sales with Letters. New 
York: McGraw-Hill Book Company, 1961. 182 pp. $5.00. 


Chicago Association of Commerce and Industry, Research Or- 
ganizations and Personnel in Metropolitan Chicago. Chi- 
cago: Chicago Association of Commerce and Industry, 1961. 
49 pp. $3.00. 


Dember, William, Psychology of Perception. New York: 
Henry Holt and Company, Inc., 1960. 402 pp. $6.50. 

Frey, Albert Wesley, Advertising (Third Edition). New York: 
The Ronald Press Company, 1961. 634 pp. $7.50. 


Goldfarb, Nathan, An Introduction to Longitudinal Statistical
Analysis. Glencoe, Illinois: The Free Press, 1960. 220
pp. $5.00.


Gulliksen, Harold and Samuel Messick (Editors), Psychologi- 
cal Scaling. New York: John Wiley and Sons, Inc., 1960. 211 
pp. $5.00. 


Holdren, Bob R., The Structure of a Retail Market and the 
Market Behavior of Retail Units. Englewood Cliffs, New 
Jersey: Prentice-Hall, Inc., 1960. 203 pp. $4.50. 


Institute of Practitioners in Advertising, Third National Con- 
ference, 13-16 October, 1960. Speeches and Summaries of 
the Discussions. London: Institute of Practitioners in Ad- 
vertising, 1960. 64 pp. 


Manville, Richard, How to Create and Select Winning Ad- 
vertisements. New York: Harper and Brothers, 1947. 70 pp. 


Meynard, Jean, Better Buying through Consumer Informa- 
tion. Paris: The European Productivity Agency of the Or- 
ganisation for European Economic Co-operation, February 
1961. 136 pp. 


Modern Packaging Magazine, A Supplement to the Ninth in 
a Continuing Series of Studies of Magazine Readership 
and Purchasing Influence in the Packaging Field. New 
York: Breskin Publications. Free. 


Organisation for European Economic Co-operation, Paris: 
The European Productivity Agency of the Organisation 
for European Economic Co-operation. 

The Consumer’s Food-Buying Habits. March 1958. 150
pp. $1.50.
Digests of Marketing and Distribution Publications. Nos. 
I and II, 1961. 
Documentation Service on Research into Distribution. 
1961. 73 pp. 


Paulu, Burton, British Broadcasting in Transition. Minneap- 
olis: University of Minnesota Press, 1961. 250 pp. $5.00. 


Quenouille, M. H., Rapid Statistical Calculation. London: 
Charles Griffin and Company, Ltd., 1959. 44 pp. 10s. 


Report of the Committee on Interstate and Foreign Com- 
merce, Evaluation of Statistical Methods Used in Obtaining 
Broadcast Ratings. Washington, D. C.: U. S. Government 
Printing Office, March 1961. 163 pp. Free. 


Rokeach, Milton, The Open and Closed Mind. New York: 
Basic Books, Inc. 1960. 447 pp. $7.50. 


Schettler, Clarence, Public Opinion in American Society. New 
York: Harper and Brothers, 1960. 534 pp. $7.00. 


Teaching Machines, Inc., Statistical Inference. Volume II of
a program in statistics. Albuquerque, New Mexico: Teaching
Machines, Inc., 1960. $10.00.


Wood, James Playsted, Advertising and the Soul’s Belly. Ath- 
ens, Georgia: University of Georgia Press, 1961. 116 pp. 
$3.50. 


Yankelovich, Daniel, Inc., As Their Readers See Them. A 
Study of Apperceptive Values in General-Business and 
News Publications. Business Week Research Report No. 
67. New York: McGraw-Hill Book Company, April 1961. 
Free. 

Mrs. Mead thought the interview would
never be over. 

First all those questions about buying 
things, some she’d never heard of, and then 
when she had bought one, that silly business of 
checking the package to see the size. And now 
questions about all these TV programs. 

She left off feeding the baby and glanced at 
the interviewer. 

“Polynesian Eye?” the Interviewer asked for
the second time.

“I’m sorry . . .”

“Polynesian Eye. You know, the one with the
two detectives and their Chinese girlfriend.
Same question: Did-you-watch-it-last-week-or-do-
you-regularly-watch-it-and-just-happen-to-have-
missed-it-last-week?”

Mrs. Mead watched abstractedly as a large 
bead of sweat fell from the interviewer’s nose 
on to her clipboard. It was a hot day. And the 
poor girl had probably been lugging that 
satchel around all morning. 

“Would you like some ice-water?” Mrs.
Mead asked.

“I’d love some,” the Interviewer said, “but
couldn’t we get this list finished first? Polynesian
Eye. How about Polynesian Eye?”

“Oh yes, I guess so,” Mrs. Mead said. The
sweet thing was so earnest, and it was nice to 
get the set of matched dish towels. And, after 
all, they had seen the Something-or-other Eye 
show last week. Or was it the week before last? 




“Now then,” the Interviewer said. “Mule
Train?” 
The baby began to cry. 


Mrs. Mead has just become a statistic. 

As such she will appear sooner or later in the 
upper left-hand corner of a 2 x 2 table relating 
the consumption of products to the consumption of 
media. At the left the two rows will be marked 
“buyers” and “non-buyers.” Across the top, the 
columns will be headed “viewers” and “non-viewers.”






The Girls Who Can’t Say No 






On a hot June afternoon, with the best intentions
in the world, Mrs. Mead falsely reported
being a viewer-buyer.

Now a buyer she probably was. The interviewer
saw the package. Maybe her mother-in-law had
given it to her when she came over for her regular
Wednesday night dinner, but it’s not likely.

And was she a viewer? Maybe, maybe not. If
anyone could say, who better than she? But how
she answered the viewing question could have been
determined by any number of things, including
her baby’s appetite at that moment.

It wasn’t that Mrs. Mead didn’t want to give an 
accurate report. Of course she did. But Mrs. Mead 
wanted to do much more. She wanted to be nice 
to the interviewer. She wanted to seem up-to-date 
and knowledgeable about TV. She wanted to ap- 
pear grateful for the premium she received. Most 
of all, she wanted to get the interview over with. 

The worst of it is that the hot weather, the baby’s
appetite and the interviewer’s disposition probably
had little influence. Mrs. Mead just always answers
questions like that. Ask her friends: they’ll tell you
that’s the way she is.

But this leads to a hair-raising thought. Suppose
our samples are loaded with Mrs. Meads, nice
ladies of all ages and incomes, who over-claim
listening, reading, trying, buying, or doing anything
you please, and who really like almost everything
they’re asked to rate. Wouldn’t they falsely
strengthen the relationship we observe between any
pair of these activities? And would we ever know
how much?

We would not—not, that is, until we could somehow
measure this tendency to be agreeable and
thereby identify the Mrs. Meads. Bill Wells has begun
to do this, as he reports on the first page of this
issue.
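
The arithmetic behind this worry can be sketched in a few lines. The example below is only an illustration under stated assumptions: the function names, the sample size, the buying and viewing rates, and the share of agreeable respondents are all hypothetical, and the phi coefficient is used here simply as one common measure of association in a 2 x 2 table. Buying and viewing are assumed to be truly unrelated among accurate respondents, while the agreeable respondents claim both.

```python
def observed_table(n, p_buy, p_view, p_yea):
    """Expected 2 x 2 counts (illustrative assumptions only).

    Among accurate respondents buying and viewing are independent.
    A share p_yea of the sample claims both, whatever the truth.
    Returns (buyer_viewer, buyer_nonviewer, nonbuyer_viewer, nonbuyer_nonviewer).
    """
    accurate = n * (1 - p_yea)
    agreeable = n * p_yea
    buyer_viewer = accurate * p_buy * p_view + agreeable  # upper left-hand corner
    buyer_nonviewer = accurate * p_buy * (1 - p_view)
    nonbuyer_viewer = accurate * (1 - p_buy) * p_view
    nonbuyer_nonviewer = accurate * (1 - p_buy) * (1 - p_view)
    return buyer_viewer, buyer_nonviewer, nonbuyer_viewer, nonbuyer_nonviewer


def phi(a, b, c, d):
    """Phi coefficient of association for the 2 x 2 table [[a, b], [c, d]]."""
    denominator = ((a + b) * (c + d) * (a + c) * (b + d)) ** 0.5
    return (a * d - b * c) / denominator


for share in (0.0, 0.2):
    table = observed_table(1000, 0.3, 0.4, share)
    print(share, [round(x) for x in table], round(phi(*table), 2))
```

With no agreeable respondents the association is zero, as it should be when the two activities are unrelated. With one respondent in five claiming both, the upper left-hand cell grows from 120 to 296 and the coefficient rises to about 0.27, although nothing has changed in what people actually bought or watched.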

The girls who can’t say no will always be with
us. We had better isolate their contribution to our
2 x 2 tables and examine it separately. Those upper
left-hand corners may be dangerously swollen.

—CHARLES K. RAMOND










